VECTORS FOR STABLE GENE EXPRESSION

This application claims priority to U.S. Provisional Application Nos. 60/629,148,
filed November 18, 2004, and 60/632,701, filed December 2, 2004, each of which is
incorporated herein by reference.

FIELD OF THE INVENTION 

The present invention relates to expression vectors capable of promoting transgene
expression. The expression vectors include site-specific recombination elements, insulator
elements, and recombinase coding sequences. In particular, the present invention provides
methods for obtaining specific and stable integration of nucleic acids into eukaryotic cells
through site-specific recombination.

BACKGROUND OF THE INVENTION 

Genetic transformation of eukaryotes often suffers from significant shortcomings.
For example, it is often difficult to reproducibly obtain integration of a transgene at a
particular locus of interest. Homologous recombination generally occurs only at a very low
frequency. To overcome this problem, site-specific recombination systems are employed.
Site-specific recombination generally involves the use of recombination sites and recombinase
proteins.

The Cre-lox system of bacteriophage P1 and the FLP-FRT system of
Saccharomyces cerevisiae are widely used for transgene and chromosome engineering in
animals and plants (see, e.g., Sauer (1994) Curr. Opin. Biotechnol. 5: 521-527; Ow (1996)
Curr. Opin. Biotechnol. 7: 181-186). Other systems that operate in animal or plant cells
include the following: 1) the R-RS system from Zygosaccharomyces rouxii (see, e.g.,
Onouchi et al. (1995) Mol. Gen. Genet. 247: 653-660), 2) the Gin-gix system from
bacteriophage Mu (see, e.g., Maeser & Kahmann (1991) Mol. Gen. Genet. 230: 170-176),
and 3) the β-recombinase-six system from bacterial plasmid pSM19035 (see, e.g., Diaz et
al. (1999) J. Biol. Chem. 274: 6634-6640). By using site-specific recombinases, one can
obtain a greater frequency and specificity of integration.

However, these systems suffer from a significant shortcoming. They have in common
the property that a single polypeptide recombinase catalyzes the
recombination between two sites of identical or nearly identical sequence. The product
sites generated by recombination are themselves substrates for subsequent recombination.
Consequently, recombination reactions are readily reversible. Since the kinetics of
intramolecular interactions are favored over those of intermolecular interactions, these
recombination systems are more efficient at deleting DNA than at integrating it.

An additional problem with the expression of foreign genes in eukaryotic cells is the
clonal variation in the expression of the same gene in independent transformants, a problem
referred to as "position effect" variation. No completely satisfactory method of obviating
this problem has yet been developed.

Thus, a need exists for methods and systems for obtaining stable site-specific
integration of genes of interest. Additionally, a need exists for reducing position effect
variation.

SUMMARY OF THE INVENTION 

The present invention relates to expression vectors capable of promoting transgene
expression. The expression vectors include site-specific recombination elements, insulator
elements, and recombinase coding sequences. In particular, the present invention provides
methods for obtaining specific and stable integration of nucleic acids into eukaryotic cells
through site-specific recombination.

In certain embodiments, the present invention provides an expression vector
comprising a promoter, a transgene, a site specific recombination site, and an insulator
element(s). In other embodiments, the expression vector contains a promoter, a site specific
recombination site, an insulator element(s), and a restriction enzyme site for insertion of
a transgene of interest. The invention further provides a second expression vector that
encodes a recombinase protein capable of catalyzing the integration of the first expression
vector into a host cell genome. In some embodiments, the promoter is an epidermal cell
specific promoter, while in further embodiments, the promoter is a keratinocyte specific
promoter. In preferred embodiments, the promoter is a keratin-5 (K5), involucrin (INV), or
keratin-14 (K14) promoter. In other embodiments, the transgene is VEGF. In other
embodiments, the transgene is KGF-2. In preferred embodiments, the site specific
recombination site is attB. In preferred embodiments, the insulator element is HS-4. In
further embodiments, the HS-4 is an HS-4 dimer.

In some embodiments, the recombinase element is selected from the group
consisting of a bacteriophage φC31 integrase, a coliphage P4 recombinase, a Listeria phage
recombinase, a bacteriophage R4 Sre recombinase, a CisA recombinase, an XisF
recombinase, and a transposon Tn4451 TnpX recombinase. In preferred embodiments, the
recombinase element is a φC31 integrase.

In preferred embodiments, the recombinase coding sequence is operably linked to
the promoter in the second expression vector. In preferred embodiments, the transgene is
operably linked to the promoter.

In certain embodiments, the present invention provides an expression vector 
comprising a promoter, a gene of interest, a site specific recombination site, and an insulator 
element. In preferred embodiments, the promoter is a keratinocyte promoter. In further 
embodiments, the promoter is K14. In preferred embodiments, the gene of interest is a
transgene. In further embodiments, the transgene is VEGF. In still further embodiments,
the transgene is KGF-2. In preferred embodiments, the site specific recombination site is
attB. In preferred embodiments, the insulator element is HS-4. In further embodiments, the
HS-4 is an HS-4 dimer.

In some preferred embodiments, an additional expression vector comprising a
recombinase element is provided. In some embodiments, the recombinase element is
selected from the group consisting of a bacteriophage φC31 integrase, a coliphage P4
recombinase, a Listeria phage recombinase, a bacteriophage R4 Sre recombinase, a CisA
recombinase, an XisF recombinase, and a transposon Tn4451 TnpX recombinase. In
preferred embodiments, the recombinase element is a φC31 integrase.

In preferred embodiments, the recombinase element is operably linked to a
promoter. In some embodiments, the promoter is an epidermal cell specific promoter. In
further preferred embodiments, the promoter is a keratinocyte specific promoter.



DESCRIPTION OF THE FIGURES 

Figure 1 provides the consensus sequence of the K14 promoter (SEQ ID NO: 1).
Figure 2 provides the consensus sequence for the involucrin promoter (SEQ ID NO: 2).
Figure 3 shows an expression vector of the present invention.
Figure 4 provides the consensus sequence for VEGF (SEQ ID NO: 3).
Figure 5 provides a full length HS4 insulator sequence (SEQ ID NO: 4), the HS4
core sequence (SEQ ID NO: 5), and the HS4 dimer sequence (SEQ ID NO: 6).
Figure 6 provides an attB recombination site sequence (SEQ ID NO: 7).
Figure 7 provides a complete vector sequence (SEQ ID NO: 8).
DEFINITIONS

To facilitate understanding of the invention, a number of terms are defined below.
As used herein, the terms "insulator elements," "insulator borders," and related terms
refer to chromosomal elements capable of hindering the effect of transcriptional enhancers
on promoters and of protecting the transcription of transgenes from both positive and negative
chromosomal position effect variegation. Examples of insulator elements include, but are
not limited to, HS2, HS3, and HS4. The terms "HS2", "HS3" and "HS4" refer to full-length
insulator elements as well as elements that are derived from the full length insulator
elements such as fragments of the insulator elements (e.g., HS4 fragments as exemplified
herein).

As used herein, the term "growth factor" refers to extracellular molecules that bind
to a cell surface, triggering an intracellular signaling pathway leading to proliferation,
differentiation, or other cellular response. Examples of growth factors include, but are not
limited to, growth factor I, trophic factor, Ca2+, insulin, hormones, synthetic molecules,
pharmaceutical agents, and LDL.

As used herein, the term "keratinocyte growth factor" or "KGF" refers to a member
of a group of structurally distinct proteins known as FGFs that display varying degrees of
sequence homology, suggesting that they are encoded by a related family of genes. The
FGFs share common receptor sites on cell surfaces. KGF, for example, can bind to FGFR-3.

As used herein, the term "NIKS cells" refers to cells having the characteristics of the 
cells deposited as cell line ATCC CRL-12191. 

The term "gene" refers to a nucleic acid (e.g., DNA) sequence that comprises coding
sequences necessary for the production of a polypeptide, RNA, or precursor. The
polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any
portion of the coding sequence so long as the desired activity or functional properties (e.g.,
enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment
are retained. The term also encompasses the coding region of a structural gene and the
sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of
about 1 kb on either end such that the gene corresponds to the length of the full-length
mRNA. The sequences that are located 5' of the coding region and which are present on the
mRNA are referred to as 5' untranslated sequences. The sequences that are located 3' or
downstream of the coding region and that are present on the mRNA are referred to as 3'
untranslated sequences. The term "gene" encompasses both cDNA and genomic forms of a
gene. A genomic form or clone of a gene contains the coding region interrupted with
non-coding sequences termed "introns" or "intervening regions" or "intervening sequences."
Introns are segments of a gene that are transcribed into heterogeneous nuclear RNA (hnRNA);
introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from
the nuclear or primary transcript; introns therefore are absent in the messenger RNA
(mRNA) transcript. The mRNA functions during translation to specify the sequence or
order of amino acids in a nascent polypeptide.

The term "recombinase" refers to an enzyme that catalyzes recombination between
two or more recombination sites. Recombinases useful in the present invention catalyze
recombination at specific recombination sites, which are specific polynucleotide sequences
that are recognized by a particular recombinase. The term "integrase" refers to a type of
recombinase.

The terms "recombination elements" and "recombination sites" refer to specific
polynucleotide sequences that are recognized by the recombinase enzymes described herein.
Typically, two different sites are involved (termed "complementary sites"), one present in
the target nucleic acid (e.g., a chromosome or episome of a eukaryote) and another on the
nucleic acid that is to be integrated at the target recombination site. The terms "attB,"
"attP," "attL," and "attR," which refer to attachment (or recombination) sites originally from
a bacterial target and a phage donor, respectively, are used herein although recombination
sites for particular enzymes may have different names. Recombination elements which
share sequence or functional similarity to the bacterial/phage recombination sites are present
in mammalian genomes and are also defined as recombination elements herein.

Where "amino acid sequence" is recited herein to refer to an amino acid sequence of
a naturally occurring protein molecule, "amino acid sequence" and like terms, such as
"polypeptide" or "protein," are not meant to limit the amino acid sequence to the complete,
native amino acid sequence associated with the recited protein molecule.

In addition to containing introns, genomic forms of a gene may also include
sequences located on both the 5' and 3' end of the sequences that are present on the RNA
transcript. These sequences are referred to as "flanking" sequences or regions (these
flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA
transcript). The 5' flanking region may contain regulatory sequences such as promoters and
enhancers that control or influence the transcription of the gene. The 3' flanking region may



WO 2006/055931 



PCT/US2005/042219 



contain sequences that direct the termination of transcription, post-transcriptional cleavage 
and polyadenylation. 

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides
along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides
determines the order of amino acids along the polypeptide (protein) chain. The DNA
sequence thus codes for the amino acid sequence.

DNA molecules are said to have "5' ends" and "3' ends" because mononucleotides
are reacted to make oligonucleotides or polynucleotides in a manner such that the 5'
phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in
one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide or
polynucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of
a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5'
phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid
sequence, even if internal to a larger oligonucleotide or polynucleotide, also may be said to
have 5' and 3' ends. In either a linear or circular DNA molecule, discrete elements are
referred to as being "upstream" or 5' of the "downstream" or 3' elements. This terminology
reflects the fact that transcription proceeds in a 5' to 3' fashion along the DNA strand. The
promoter and enhancer elements that direct transcription of a linked gene are generally
located 5' or upstream of the coding region. However, enhancer elements can exert their
effect even when located 3' of the promoter element and the coding region. Transcription
termination and polyadenylation signals are located 3' or downstream of the coding region.

As used herein, the terms "an oligonucleotide having a nucleotide sequence
encoding a gene" and "polynucleotide having a nucleotide sequence encoding a gene"
mean a nucleic acid sequence comprising the coding region of a gene or, in other words,
the nucleic acid sequence that encodes a gene product. The coding region may be present in
a cDNA, genomic DNA, or RNA form. When present in a DNA form, the oligonucleotide
or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation
signals, etc. may be placed in close proximity to the coding region of the gene if needed to
permit proper initiation of transcription and/or correct processing of the primary RNA
transcript. Alternatively, the coding region utilized in the expression vectors of the present
invention may contain endogenous enhancers/promoters, splice junctions, intervening
sequences, polyadenylation signals, etc. or a combination of both endogenous and
exogenous control elements.

As used herein, the term "regulatory element" refers to a genetic element that
controls some aspect of the expression of nucleic acid sequences. For example, a promoter
is a regulatory element that facilitates the initiation of transcription of an operably linked
coding region. Other regulatory elements include splicing signals, polyadenylation signals,
termination signals, etc.

As used herein, the terms "complementary" or "complementarity" are used in
reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing
rules. For example, the sequence "5'-A-G-T-3'" is complementary to the sequence
"3'-T-C-A-5'." Complementarity may be "partial," in which only some of the nucleic acids'
bases are matched according to the base pairing rules. Or, there may be "complete" or
"total" complementarity between the nucleic acids. The degree of complementarity
between nucleic acid strands has significant effects on the efficiency and strength of
hybridization between nucleic acid strands. This is of particular importance in amplification
reactions, as well as detection methods that depend upon binding between nucleic acids.
Complementarity can include the formation of base pairs between any type of nucleotides,
including non-natural bases, modified bases, synthetic bases and the like.
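
By way of illustration only, the base-pairing rules described above can be sketched in Python. The function names `complement` and `fraction_paired` are hypothetical and form no part of the invention. The sketch assumes unmodified DNA bases (A, C, G, T) and two strands of equal length that are already aligned in antiparallel fashion, the first written 5' to 3' and the second written 3' to 5'.

```python
# Watson-Crick pairing rules for unmodified DNA bases (an assumption of
# this sketch; modified or non-natural bases are not modeled).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq_5to3: str) -> str:
    """Return the complementary strand, written 3' to 5' under the input."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def fraction_paired(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that obey the base-pairing rules."""
    if len(strand_5to3) != len(other_3to5) or not strand_5to3:
        raise ValueError("strands must be non-empty and of equal length")
    aligned = zip(strand_5to3.upper(), other_3to5.upper())
    matches = sum(PAIRS.get(a) == b for a, b in aligned)
    return matches / len(strand_5to3)


print(complement("AGT"))              # complement of 5'-A-G-T-3'
print(fraction_paired("AGT", "TCA"))  # every position pairs
print(fraction_paired("AGT", "TCC"))  # only some positions pair
```

A value of 1.0 corresponds to "complete" complementarity and an intermediate value to "partial" complementarity as those terms are used above.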

The term "homology" refers to a degree of complementarity. There may be partial
homology or complete homology (i.e., identity). A partially complementary sequence is
one that at least partially inhibits a completely complementary sequence from hybridizing to
a target nucleic acid and is referred to using the functional term "substantially homologous."
The term "inhibition of binding," when used in reference to nucleic acid binding, refers to
inhibition of binding caused by competition of homologous sequences for binding to a
target sequence. The inhibition of hybridization of the completely complementary sequence
to the target sequence may be examined using a hybridization assay (Southern or Northern
blot, solution hybridization and the like) under conditions of low stringency. A
substantially homologous sequence or probe will compete for and inhibit the binding (i.e.,
the hybridization) of a completely homologous sequence to a target under conditions of low
stringency. This is not to say that conditions of low stringency are such that non-specific
binding is permitted; low stringency conditions require that the binding of two sequences to
one another be a specific (i.e., selective) interaction. The absence of non-specific binding
may be tested by the use of a second target that lacks even a partial degree of
complementarity (e.g., less than about 30% identity); in the absence of non-specific binding
the probe will not hybridize to the second non-complementary target.

The art knows well that numerous equivalent conditions may be employed to
comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base
composition) of the probe and nature of the target (DNA, RNA, base composition, present
in solution or immobilized, etc.) and the concentration of the salts and other components
(e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are
considered and the hybridization solution may be varied to generate conditions of low
stringency hybridization different from, but equivalent to, the above listed conditions. In
addition, the art knows conditions that promote hybridization under conditions of high
stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use
of formamide in the hybridization solution, etc.).

When used in reference to a double-stranded nucleic acid sequence such as a cDNA
or genomic clone, the term "substantially homologous" refers to any probe that can
hybridize to either or both strands of the double-stranded nucleic acid sequence under
conditions of low stringency as described above.

When used in reference to a single-stranded nucleic acid sequence, the term
"substantially homologous" refers to any probe that can hybridize to (i.e., it is the complement
of) the single-stranded nucleic acid sequence under conditions of low stringency as
described above.

The term "fragment" as used herein refers to a polypeptide that has an amino-terminal
and/or carboxy-terminal deletion as compared to the native protein, but where the
remaining amino acid sequence is identical to the corresponding positions in the amino acid
sequence deduced from a full-length cDNA sequence. Fragments typically are at least 4
amino acids long, preferably at least 20 amino acids long, usually at least 50 amino acids
long or longer, and span the portion of the polypeptide required for intermolecular binding
of the compositions (claimed in the present invention) with its various ligands and/or
substrates.

As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer
to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific
nucleotide sequence.

As used herein, the term "recombinant DNA molecule" refers to a
DNA molecule that is comprised of segments of DNA joined together by means of
molecular biological techniques.
As used herein, the term "coding region" when used in reference to a structural gene
refers to the nucleotide sequences that encode the amino acids found in the nascent
polypeptide as a result of translation of an mRNA molecule. The coding region is bounded,
in eukaryotes, on the 5' side by the nucleotide triplet "ATG" that encodes the initiator
methionine and on the 3' side by one of the three triplets that specify stop codons (i.e.,
TAA, TAG, TGA).
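
By way of illustration only, the boundaries of a coding region as defined above can be located with a short Python sketch. The function name `find_coding_region` is hypothetical. The sketch assumes a DNA sequence written 5' to 3', searches only that strand, and reports the first "ATG" that is followed by an in-frame stop triplet; it does not model introns or alternative start sites.

```python
from typing import Optional, Tuple

# The three stop triplets named in the definition above.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first ATG-to-stop region, or None.

    `start` indexes the A of the ATG; `end` is one past the last base of
    the stop codon, so dna[start:end] spans the whole region.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop for this ATG; try the next ATG downstream.
        start = dna.find("ATG", start + 1)
    return None


region = find_coding_region("CCATGGCTTAAGG")
print(region)
```

Slicing the input with the returned indices yields the region from the initiator ATG through the stop codon.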

As used herein, the term "portion" when in reference to a protein (as in "a portion of
a given protein") refers to fragments of that protein. The fragments may range in size from
four consecutive amino acid residues to the entire amino acid sequence minus one amino
acid.

The term "gene of interest" as used herein refers to a foreign, heterologous, or
autologous gene that is placed into an organism by introducing the gene into newly
fertilized eggs or early embryos. The term "foreign gene" refers to any nucleic acid (e.g.,
gene sequence) that is introduced into the genome of an animal by experimental
manipulations and may include gene sequences found in that animal so long as the
introduced gene does not reside in the same location as does the naturally-occurring gene.
The term "autologous gene" is intended to encompass variants (e.g., polymorphisms or
mutants) of the naturally occurring gene. The term "gene of interest" thus encompasses the
replacement of the naturally occurring gene with a variant form of the gene.

As used herein, the term "vector" is used in reference to nucleic acid molecules that
transfer DNA segment(s) from one cell to another. The term "vehicle" is sometimes used
interchangeably with "vector."

The term "expression vector" as used herein refers to a recombinant DNA molecule
containing a desired coding sequence and appropriate nucleic acid sequences necessary for
the expression of the operably linked coding sequence in a particular host organism.
Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter,
an operator (optional), and a ribosome binding site, often along with other sequences.
Eukaryotic cells are known to utilize promoters, enhancers, and termination and
polyadenylation signals.

As used herein, the term "host cell" refers to any eukaryotic or prokaryotic cell (e.g.,
bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells,
plant cells, fish cells, and insect cells), whether located in vitro or in vivo. For example, host
cells may be located in a transgenic animal.
The terms "overexpression" and "overexpressing" and grammatical equivalents are
used in reference to levels of mRNA to indicate a level of expression approximately 3-fold
higher than that typically observed in a given tissue in a control or non-transgenic animal.
Levels of mRNA are measured using any of a number of techniques known to those skilled
in the art including, but not limited to, Northern blot analysis (see Example 10 for a
protocol for performing Northern blot analysis).
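
By way of illustration only, the fold-change comparison implied by this definition can be sketched in Python. The function names and the hard cutoff are hypothetical: the definition says "approximately 3-fold," so treating 3.0 as a strict threshold is an assumption of this sketch, as is the premise that both mRNA levels were quantified in the same units.

```python
def fold_change(sample_level: float, control_level: float) -> float:
    """Ratio of sample mRNA level to control mRNA level (same units)."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return sample_level / control_level


def is_overexpressed(sample_level: float, control_level: float,
                     threshold: float = 3.0) -> bool:
    """True if the sample meets the assumed fold-change threshold."""
    return fold_change(sample_level, control_level) >= threshold


# Hypothetical band intensities from a quantified blot, arbitrary units.
print(is_overexpressed(6.0, 2.0))
print(is_overexpressed(4.0, 2.0))
```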

The term "transfection" as used herein refers to the introduction of foreign DNA into
eukaryotic cells. Transfection may be accomplished by a variety of means known to the art
including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection,
polybrene-mediated transfection, electroporation, microinjection, liposome fusion,
lipofection, protoplast fusion, retroviral infection, and biolistics.

The term "stable transfection" or "stably transfected" refers to the introduction and
integration of foreign DNA into the genome of the transfected cell. The term "stable
transfectant" refers to a cell that has stably integrated foreign DNA into the genomic DNA.

The term "transient transfection" or "transiently transfected" refers to the
introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the
genome of the transfected cell. The foreign DNA persists in the nucleus of the transfected
cell for several days. During this time the foreign DNA is subject to the regulatory controls
that govern the expression of endogenous genes in the chromosomes. The term "transient
transfectant" refers to cells that have taken up foreign DNA but have failed to integrate this
DNA.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides methods for obtaining site-specific recombination of
a gene of interest in eukaryotic cells. The products of the recombinations performed using
the methods of the present invention are stable. Thus, one can use the methods to, for
example, introduce transgenes into chromosomes of eukaryotic cells and avoid the excision
of the transgene that often occurs using previously known site-specific recombination
systems. Stable inversions, translocations, and other rearrangements can also be obtained.

The practice of the present invention employs, unless otherwise indicated,
conventional techniques of organic chemistry, pharmacology, molecular biology (including
recombinant techniques), cell biology, biochemistry, and immunology, which are within the
skill of the art. Such techniques are explained fully in the literature, such as "Molecular
cloning: a laboratory manual," Second Edition (Sambrook et al., 1989); "Oligonucleotide
synthesis" (M.J. Gait, ed., 1984); "Animal cell culture" (R.I. Freshney, ed., 1987); the series
"Methods in enzymology" (Academic Press, Inc.); "Handbook of experimental
immunology" (D.M. Weir & C.C. Blackwell, eds.); "Gene transfer vectors for mammalian
cells" (J.M. Miller & M.P. Calos, eds., 1987); "Current protocols in molecular biology"
(F.M. Ausubel et al., eds., 1987, and periodic updates); "PCR: the polymerase chain
reaction" (Mullis et al., eds., 1994); and "Current protocols in immunology" (J.E. Coligan et
al., eds., 1991), each of which is herein incorporated by reference in its entirety.

I. Site Specific Recombination 

Many bacteriophage and integrative plasmids encode site-specific recombination
systems that enable the stable incorporation of their genome into those of their hosts. In
these systems, the minimal requirements for the recombination reaction are a recombinase
enzyme, or integrase, which catalyzes the recombination event, and two recombination sites
(Sadowski (1986) J. Bacteriol. 165: 341-347; Sadowski (1993) FASEB J. 7: 760-767; each
herein incorporated by reference in its entirety). For phage integration systems, these are
referred to as attachment (att) sites, with an attP element from phage DNA and the attB
element encoded by the bacterial genome. The two attachment sites can share as little
sequence identity as a few base pairs. The recombinase protein binds to both att sites and
catalyzes a conservative and reciprocal exchange of DNA strands that results in integration
of the circular phage or plasmid DNA into host DNA.

The methods of the present invention employ site-specific recombination systems to
achieve stable integration or other rearrangement of nucleic acids in eukaryotic cells.
A site-specific recombination system typically consists of three elements: two
specific DNA sequences ("the recombination sites") and a specific enzyme ("the
recombinase") (see, e.g., U.S. Patent No. 6,746,780; herein incorporated by reference in its
entirety). The recombinase catalyzes a recombination reaction between the specific
recombination sites. Integration of an expression vector containing one recombination site
by recombination with a second recombination site in the genome of a host cell results in
the entire integrated expression vector being flanked by two hybrid recombination sites.
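
By way of illustration only, the integration event described above can be modeled symbolically in Python. The function name `integrate`, the feature labels, and the list representation of DNA are hypothetical. The sketch assumes a circular vector carrying an attB site and a linear genome carrying a single attP-like site, and it labels the two hybrid sites attL and attR; which flank receives which label is arbitrary here.

```python
from typing import List


def integrate(genome: List[str], vector: List[str],
              genome_site: str = "attP",
              vector_site: str = "attB") -> List[str]:
    """Insert a circular vector into a linear genome at matching sites.

    Both parental sites are consumed, and the inserted vector ends up
    flanked by two hybrid sites (labeled attL and attR in this sketch).
    """
    g = genome.index(genome_site)
    v = vector.index(vector_site)
    # Linearize the circular vector by opening it at its own site.
    payload = vector[v + 1:] + vector[:v]
    return genome[:g] + ["attL"] + payload + ["attR"] + genome[g + 1:]


genome = ["geneA", "attP", "geneB"]
vector = ["attB", "promoter", "transgene", "insulator"]
print(integrate(genome, vector))
```

Neither parental site remains in the product; only the two hybrid sites flank the inserted elements.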

Recombination sites have an orientation. The orientation of the recombination sites
in relation to each other determines what recombination event takes place. The
recombination sites may be in two different orientations: parallel (same direction) or
opposite. When the recombination sites are present on a single nucleic acid molecule and
are in a parallel orientation to each other, then the recombination event catalyzed by the
recombinase is typically an excision of the intervening nucleic acid, leaving a single
recombination site. When the recombination sites are in the opposite orientation, then any
intervening sequence is typically inverted.
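
By way of illustration only, the dependence of the outcome on site orientation can be sketched in Python. The function name `recombine`, the label "site", and the representation of each element as a (name, orientation) pair are hypothetical. The sketch assumes exactly two recombination sites on a single linear molecule.

```python
from typing import List, Tuple

Feature = Tuple[str, int]  # (name, orientation), orientation is +1 or -1


def recombine(features: List[Feature], site: str = "site") -> List[Feature]:
    """Apply one recombination event between two sites on one molecule."""
    idx = [i for i, (name, _) in enumerate(features) if name == site]
    if len(idx) != 2:
        raise ValueError("expected exactly two recombination sites")
    i, j = idx
    if features[i][1] == features[j][1]:
        # Parallel sites: the intervening segment is excised and a
        # single recombination site is left behind.
        return features[:i + 1] + features[j + 1:]
    # Opposite sites: the intervening segment is inverted, meaning its
    # order is reversed and each element's orientation is flipped.
    inverted = [(name, -orient) for name, orient in reversed(features[i + 1:j])]
    return features[:i + 1] + inverted + features[j:]


parallel = [("a", 1), ("site", 1), ("b", 1), ("c", 1), ("site", 1), ("d", 1)]
opposite = [("a", 1), ("site", 1), ("b", 1), ("c", 1), ("site", -1), ("d", 1)]
print(recombine(parallel))
print(recombine(opposite))
```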

The recombinases used in the methods of the present invention mediate site-specific
recombination between a first recombination site and a second recombination site that can
serve as a substrate for recombination with the first recombination site. However, in the
absence of an additional factor that is not normally present in eukaryotic cells, these
recombinases cannot mediate recombination between the two hybrid recombination sites that are formed
upon recombination between the first recombination site and the second recombination site.
Examples of such recombinases include the bacteriophage φC31 integrase
(see, e.g., Thorpe & Smith (1998) Proc. Natl. Acad. Sci. USA 95: 5505-5510; Kuhstoss &
Rao (1991) J. Mol. Biol. 222: 897-908; U.S. Pat. No. 5,190,871), a phage P4 recombinase
(Ow & Ausubel (1983) J. Bacteriol. 155: 704-713), a Listeria phage recombinase, a
bacteriophage R4 Sre recombinase (Matsuura et al. (1996) J. Bacteriol. 178: 3374-3376), a
CisA recombinase (Sato et al. (1990) J. Bacteriol. 172: 1092-1098; Stragier et al. (1989)
Science 243: 507-512), an XisF recombinase (Carrasco et al. (1994) Genes Dev. 8: 74-83),
and a transposon Tn4451 TnpX recombinase (Bannam et al. (1995) Mol. Microbiol. 16:
535-551; Crellin & Rood (1997) J. Bacteriol. 179: 5148-5156; each herein incorporated by
reference in its entirety).

Recombinase polypeptides, and nucleic acids that encode the recombinase
polypeptides, are described in the art and can be obtained using routine methods. For
example, a vector that includes a nucleic acid fragment that encodes the φC31 integrase is
described in U.S. Pat. No. 5,190,871 (additionally, see, e.g., Andreas, et al., Nucleic Acids
Research, 30, 11, 2299-2306 (2002); Ortiz-Urda, et al., Nature Medicine, 8, 10, 1166-1170
(2002); Groth, et al., PNAS, 97(11), 5995-6000 (2000); Olivares, et al., Nature Biotech., 20,
1124-1128 (2002); Thorpe, et al., PNAS 95, 5505-5510 (1998); Baer, et al., Current Opin.
Biotech., 12, 473-480 (2001); each herein incorporated by reference in its entirety).

The recombinases can be introduced into the eukaryotic cells that contain the
recombination sites at which recombination is desired by any suitable method. For example,
one can introduce the recombinase in polypeptide form, e.g., by microinjection or other
methods. In presently preferred embodiments, however, a gene that encodes the
recombinase is introduced into the cells. Expression of the gene results in production of the
recombinase, which then catalyzes recombination among the corresponding recombination
sites. One can introduce the recombinase gene into the cell before, after, or simultaneously
with, the introduction of the exogenous polynucleotide of interest. In one embodiment, the
recombinase gene is present on a separate vector and the vector encoding the recombinase
and the vector encoding the gene of interest are cotransfected. In other embodiments, the
recombinase gene is introduced into a transgenic eukaryotic organism, e.g., a transgenic
plant, animal, fungus, or the like, which is then crossed with an organism that contains the
corresponding recombination sites.

In some embodiments, the present invention employs prokaryotic recombinases,
such as bacteriophage integrases, that are unidirectional in that they can catalyze
recombination between two complementary recombination sites, but cannot catalyze
recombination between the hybrid sites that are formed by this recombination. One such
recombinase, the φC31 integrase, by itself catalyzes only an attB x attP reaction. The
integrase cannot mediate recombination between the attL and attR sites that are formed
upon recombination between attB and attP. Because recombinases such as the φC31
integrase cannot alone catalyze the reverse reaction, the φC31 attB x attP recombination is
stable. This property is one that sets the methods of the present invention apart from
site-specific recombination systems currently in use for eukaryotic cells, such as the Cre-lox or
FLP-FRT system, where the recombination reactions can readily reverse. Use of the
recombination systems of the invention provides new opportunities for directing stable
transgene and chromosome rearrangements in eukaryotic cells.

The methods of the present invention involve contacting a pair of recombination
sites (e.g., attB and attP) that are present in a eukaryotic cell with a corresponding
recombinase. The recombinase then mediates recombination between the recombination
sites. Depending upon the relative locations of the two recombination sites, any one of a
number of events can occur as a result of the recombination. For example, if the two
recombination sites are present on different nucleic acid molecules, the recombination can
result in integration of one nucleic acid molecule into a second molecule. Thus, one can
obtain integration of a plasmid that contains one recombination site into a eukaryotic cell
chromosome that includes the corresponding recombination site. Because the recombinases
used in the methods of the invention cannot catalyze the reverse reaction, the integration is
stable. Such methods are useful, for example, for obtaining stable integration into the
eukaryotic chromosome of a gene of interest that is present on the plasmid.
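
The integration outcome described above can be illustrated with a toy model. The following Python sketch is an illustration only and is not part of the disclosed methods. It treats each DNA molecule as an ordered list of labeled elements; the function names `can_recombine` and `integrate` and the element labels are hypothetical. It assumes a circular plasmid carrying attB and a chromosome carrying attP, and it encodes the rule stated in the text that the integrase alone acts on attB x attP but not on the attL and attR hybrid sites.

```python
from typing import List

# Toy rule (assumption): the integrase alone recombines only attB with attP.
RECOMBINABLE_PAIRS = {frozenset(("attB", "attP"))}


def can_recombine(site_a: str, site_b: str) -> bool:
    """Return True if the integrase alone can recombine the two sites."""
    return frozenset((site_a, site_b)) in RECOMBINABLE_PAIRS


def integrate(plasmid: List[str], chromosome: List[str]) -> List[str]:
    """Integrate a circular plasmid (with attB) into a chromosome (with attP).

    Both molecules are ordered lists of element labels. The hybrid sites
    attL and attR replace attP and flank the inserted plasmid elements.
    """
    if "attB" not in plasmid or "attP" not in chromosome:
        raise ValueError("need attB on the plasmid and attP on the chromosome")
    b = plasmid.index("attB")
    p = chromosome.index("attP")
    # Open the circular plasmid at attB: elements after attB, then those before.
    cargo = plasmid[b + 1:] + plasmid[:b]
    return chromosome[:p] + ["attL"] + cargo + ["attR"] + chromosome[p + 1:]


if __name__ == "__main__":
    product = integrate(["attB", "promoter", "gene"], ["left", "attP", "right"])
    print(product)
    # The hybrid sites are not substrates for the integrase alone,
    # so this model has no reverse (excision) step.
    print(can_recombine("attL", "attR"))
```

In this model the integrated elements end up flanked by attL and attR, and because that pair is absent from `RECOMBINABLE_PAIRS`, nothing in the model can excise the insert again. A reversible system such as Cre-lox would instead regenerate substrate sites after recombination.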

As discussed in more detail below, preferred embodiments of the present invention
provide vectors containing site-specific recombinases (e.g., φC31 integrase) for triggering
recombination between a recombination site in the expression vector (e.g., attB) and a
different recombination site (e.g., attP site or pseudo-attP site in the case of human cells)
in the chromosome of a host cell (e.g., human chromosome 8). Such vectors permit the
generation of recombinants prepared by integrating a foreign gene into a host chromosome
of a host cell.

II. Insulator Elements 

A difficulty encountered with the introduction of transgenic constructs into
vertebrate species involves clonal variation in the expression of the same gene in
independent transformants. This problem is referred to as "position effect" variation and is
thought to relate to the effects of DNA sequences adjacent to the site where the gene was
inserted in the genome. In addition to variation of expression, sometimes the expression of
the gene product is not regulated in accordance with the promoter used to express the gene.
For example, expression of the gene product may occur in non-targeted tissues even though
a tissue specific promoter was used to express the gene product. No completely satisfactory
method of obviating these problems has yet been developed, and thus there is a continued
need for a solution.

Problems relating to the controlled expression of introduced genes arise because the
introduced gene may be inserted adjacent to regulatory elements normally present in the
genome. For example, it is known that enhancer elements can significantly increase the
expression of adjacent genes. Thus if an introduced gene construct was inserted next to a
strong enhancer element, the regulatory control of a tissue-specific promoter may be
overridden by the enhancer, thus resulting in expression of the gene in non-targeted tissues.
Undesired effects on transgene expression are also encountered when transgenes integrate
into poorly-expressed regions of the host cell genome (heterochromatin). Accordingly, to
increase the predictability and safety of expressing foreign genes in vertebrate species a
method must be provided that minimizes these "position effects." The present invention
provides insulators (e.g., HS2, HS3, HS4) that block enhancer activity and other regulatory
effects, thus allowing for a more predictable expression pattern for introduced gene
constructs. Insulator elements also have the ability to minimize the negative influence of
adjacent heterochromatic regions on transgene expression.

Insulators are nucleic acid sequences that function to block enhancer effects on
genes, and therefore insulators can be used to block position effects and allow for better
regulation of transfected genes. Insulator elements have been described in several
nonvertebrate organisms (see, e.g., U.S. Patent Publication Nos. 2003/0211581A1 and
2003/0022303A1, and Geyer, et al., Cell Mol. Life Sci. (2002) 2112-2127;
Taboit-Dameron, et al., Transgenic Research 8 (1999) 223-235; Chung, et al., Cell 74 (1993)
505-514; Szabo, et al., Development 129 (2002) 897-904; Chung, et al., PNAS 94 (1997)
575-580; Recillas-Targa, et al., PNAS 99 (2002) 6883-6888; Emery, et al., PNAS 97 (2000)
9150-9155). However, the use of insulator elements in conjunction with site specific
recombination elements has yet to be described.

As discussed in more detail below, preferred embodiments of the present invention 
provide cloning vectors containing insulator elements (e.g., HS4) for preventing position 
effect variation. Other preferred embodiments of the present invention provide expression 
vectors containing site-specific recombination sites and insulators in a configuration such 
that, after site-specific integration, the integrated expression vector is flanked by insulator 
elements. 

III. Host Cells 

Generally, the present invention is not limited to the use of any particular type of
cell or cell line. For example, a number of host cell lines are known in the art and find use
in the present invention. In general, these host cells are capable of growth and survival
when placed in either monolayer culture or in suspension culture in a medium containing
the appropriate nutrients and growth factors, as is described in more detail below.
Typically, the cells are capable of expressing and secreting large quantities of a particular
protein of interest into the culture medium. Examples of suitable mammalian host cells
include, but are not limited to, Chinese hamster ovary cells (CHO-K1, ATCC CCL-61);
bovine mammary epithelial cells (ATCC CRL 10274);
monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human
embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture; see,
e.g., Graham et al., J. Gen Virol., 36:59 [1977]); baby hamster kidney cells (BHK, ATCC
CCL 10); mouse Sertoli cells (TM4, Mather, Biol. Reprod. 23:243-251 [1980]); monkey
kidney cells (CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC
CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells
(MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung
cells (WI-38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary
tumor (MMT 060562, ATCC CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci.,
383:44-68 [1982]); MRC 5 cells; FS4 cells; rat fibroblasts (208F cells); MDBK cells
(bovine kidney cells); and a human hepatoma line (Hep G2). The present invention also
contemplates the use of amphibian and insect host cell lines. Examples of suitable insect
host cell lines include, but are not limited to, mosquito cell lines (e.g., ATCC CRL-1660).
Examples of suitable amphibian host cell lines include, but are not limited to, toad cell lines
(e.g., ATCC CCL-102).

In some preferred embodiments, the cells are cells from an epidermal cell lineage
such as keratinocytes. The present invention is not limited to the use of any particular
source of cells that are capable of differentiating into squamous epithelia. Indeed, the
present invention contemplates the use of a variety of cell lines and sources that can
differentiate into squamous epithelia, including both primary and immortalized
keratinocytes. Sources of cells include keratinocytes and dermal fibroblasts biopsied from
humans and cadaveric donors (Auger et al., In Vitro Cell. Dev. Biol. - Animal 36:96-103;
U.S. Pat. Nos. 5,968,546 and 5,693,332, each of which is incorporated herein by reference),
neonatal foreskins (Asbill et al., Pharm. Research 17(9): 1092-97 (2000); Meana et al.,
Burns 24:621-30 (1998); U.S. Pat. Nos. 4,485,096; 6,039,760; and 5,536,656, each of which
is incorporated herein by reference), and immortalized keratinocyte cell lines such as NM1
cells (Baden, In Vitro Cell. Dev. Biol. 23(3):205-213 (1987)), HaCaT cells (Boukamp et al.,
J. Cell Biol. 106:761-771 (1988)); and NIKS cells (Cell line BC-1-Ep/SL; U.S. Pat. No.
5,989,837, incorporated herein by reference; ATCC CRL-12191).

In particularly preferred embodiments, NIKS cells are utilized. NIKS cells are
thoroughly described in U.S. Provisional Patent Application Serial No. 60/493,664, and
U.S. Patent Nos. 6,514,711, 6,495,135, 6,485,724, 6,214,567, and 5,989,837; each herein
incorporated by reference in their entireties. The discovery of a novel human keratinocyte
cell line (near-diploid immortalized keratinocytes or NIKS) provides an opportunity to
genetically engineer human keratinocytes for new therapeutic methods. A unique
advantage of the NIKS cells is that they are a consistent source of genetically-uniform,
pathogen-free human keratinocytes. For this reason, they are useful for the application of
genetic engineering and genomic gene expression approaches to provide skin equivalent
cultures with properties more similar to human skin. Such systems will provide an
important alternative to the use of animals for testing compounds and formulations. The
NIKS keratinocyte cell line, identified and characterized at the University of Wisconsin, is
nontumorigenic, exhibits a stable karyotype, and undergoes normal differentiation both in
monolayer and organotypic culture. NIKS cells form fully stratified skin equivalents in
culture. These cultures are indistinguishable by all criteria tested thus far from organotypic
cultures formed from primary human keratinocytes. Unlike primary cells, however, the
immortalized NIKS cells will continue to proliferate in monolayer culture indefinitely. This
provides an opportunity to genetically manipulate the cells and isolate new clones of cells
with new useful properties (Allen-Hoffmann et al., J. Invest. Dermatol., 114(3): 444-455
(2000)).

WO 2006/055931 PCT/US2005/042219

The NIKS cells arose from the BC-1-Ep strain of human neonatal foreskin
keratinocytes isolated from an apparently normal male infant. In early passages, the
BC-1-Ep cells exhibited no morphological or growth characteristics that were atypical for cultured
normal human keratinocytes. Cultivated BC-1-Ep cells exhibited stratification as well as
features of programmed cell death. To determine replicative lifespan, the BC-1-Ep cells
were serially cultivated to senescence in standard keratinocyte growth medium at a density
of 3 x 10^5 cells per 100-mm dish and passaged at weekly intervals (approximately a 1:25
split). By passage 15, most keratinocytes in the population appeared senescent as judged by
the presence of numerous abortive colonies which exhibited large, flat cells. However, at
passage 16, keratinocytes exhibiting a small cell size were evident. By passage 17, only the
small-sized keratinocytes were present in the culture and no large, senescent keratinocytes
were evident. The resulting population of small keratinocytes that survived this putative
crisis period appeared morphologically uniform and produced colonies of keratinocytes
exhibiting typical keratinocyte characteristics including cell-cell adhesion and apparent
squame production. The keratinocytes that survived senescence were serially cultivated at a
density of 3 x 10^5 cells per 100-mm dish. Typically the cultures reached a cell density of
approximately 8 x 10^6 cells within 7 days. This stable rate of cell growth was maintained
through at least 59 passages, demonstrating that the cells had achieved immortality. The
keratinocytes that emerged from the original senescing population were originally
designated BC-1-Ep/Spontaneous Line and are now termed NIKS. The NIKS cell line has
been screened for the presence of proviral DNA sequences for HIV-1, HIV-2, EBV, CMV,
HTLV-1, HTLV-2, HBV, HCV, B-19 parvovirus, HPV-16 and HPV-31 using either PCR
or Southern analysis. None of these viruses were detected.
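
The growth figures given above (3 x 10^5 cells seeded, approximately 8 x 10^6 cells after 7 days) correspond to a roughly 27-fold expansion, which is consistent with the stated weekly split of about 1:25. The following Python sketch shows the arithmetic. It is an illustration only: the function names are hypothetical, and the doubling-time estimate assumes steady exponential growth with no lag phase or cell loss, which the text does not state.

```python
import math


def population_doublings(seeded: float, harvested: float) -> float:
    """Number of population doublings between seeding and harvest."""
    if seeded <= 0 or harvested <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(harvested / seeded)


def doubling_time_hours(seeded: float, harvested: float, days: float) -> float:
    """Average doubling time in hours, assuming steady exponential growth."""
    return days * 24.0 / population_doublings(seeded, harvested)


if __name__ == "__main__":
    # Values taken from the text: 3 x 10^5 cells seeded, ~8 x 10^6 after 7 days.
    doublings = population_doublings(3e5, 8e6)
    print(f"{doublings:.2f} doublings per passage")
    print(f"{doubling_time_hours(3e5, 8e6, 7):.1f} h per doubling")
```

Under these assumptions each weekly passage corresponds to between four and five population doublings.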

Chromosomal analysis was performed on the parental BC-1-Ep cells at passage 3
and NIKS cells at passages 31 and 54. The parental BC-1-Ep cells have a normal
chromosomal complement of 46, XY. At passage 31, all NIKS cells contained 47
chromosomes with an extra isochromosome of the long arm of chromosome 8. No other
gross chromosomal abnormalities or marker chromosomes were detected. At passage 54,
all cells contained the isochromosome 8.

The DNA fingerprints for the NIKS cell line and the BC-1-Ep keratinocytes are
identical at all twelve loci analyzed, demonstrating that the NIKS cells arose from the
parental BC-1-Ep population. The odds of the NIKS cell line having the parental BC-1-Ep
DNA fingerprint by random chance are 4 x 10^-16. The DNA fingerprints from three different
sources of human keratinocytes, ED-1-Ep, SCC4 and SCC13y, are different from the
BC-1-Ep pattern. These data also show that keratinocytes isolated from other humans, ED-1-Ep,
SCC4, and SCC13y, are unrelated to the BC-1-Ep cells or each other. The NIKS DNA
fingerprint data provide an unequivocal way to identify the NIKS cell line.

Loss of p53 function is associated with an enhanced proliferative potential and
increased frequency of immortality in cultured cells. The sequence of p53 in the NIKS cells
is identical to published p53 sequences (GenBank accession number: M14695). In humans,
p53 exists in two predominant polymorphic forms distinguished by the amino acid at codon
72. Both alleles of p53 in the NIKS cells are wild-type and have the sequence CGC at
codon 72, which codes for an arginine. The other common form of p53 has a proline at this
position. The entire sequence of p53 in the NIKS cells is identical to the BC-1-Ep
progenitor cells. Rb was also found to be wild-type in NIKS cells.
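
The codon 72 assignment above follows directly from the standard genetic code, in which CGC encodes arginine and the CCN codons encode proline. The following Python sketch illustrates the lookup. It is an illustration only: the function name `codon72_form` is hypothetical, and the table is deliberately partial, covering only the CGN arginine codons and the CCN proline codons.

```python
# Partial standard genetic code: only the CGN (arginine) and CCN (proline)
# codons needed for this example are listed.
CODON_TO_AMINO_ACID = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}


def codon72_form(codon: str) -> str:
    """Return 'Arg' or 'Pro' for a codon 72 sequence given as DNA or RNA."""
    normalized = codon.upper().replace("U", "T")
    if normalized not in CODON_TO_AMINO_ACID:
        raise ValueError(f"codon {codon!r} is not in the partial table")
    return CODON_TO_AMINO_ACID[normalized]


if __name__ == "__main__":
    print(codon72_form("CGC"))  # the codon 72 sequence reported for NIKS cells
```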

Anchorage-independent growth is highly correlated to tumorigenicity in vivo. For
this reason, the anchorage-independent growth characteristics of NIKS cells in agar or
methylcellulose-containing medium were investigated. After 4 weeks in either agar- or
methylcellulose-containing medium, NIKS cells remained as single cells. The assays were
continued for a total of 8 weeks to detect slow growing variants of the NIKS cells. None
were observed.

To determine the tumorigenicity of the parental BC-1-Ep keratinocytes and the
immortal NIKS keratinocyte cell line, cells were injected into the flanks of athymic nude
mice. The human squamous cell carcinoma cell line, SCC4, was used as a positive control
for tumor production in these animals. The injection of samples was designed such that
animals received SCC4 cells in one flank and either the parental BC-1-Ep keratinocytes or
the NIKS cells in the opposite flank. This injection strategy eliminated animal to animal
variation in tumor production and confirmed that the mice would support vigorous growth
of tumorigenic cells. Neither the parental BC-1-Ep keratinocytes (passage 6) nor the NIKS
keratinocytes (passage 35) produced tumors in athymic nude mice.






NIKS cells were analyzed for the ability to undergo differentiation in both surface
culture and organotypic culture. For cells in surface culture, a marker of squamous
differentiation, the formation of cornified envelopes, was monitored. In cultured human
keratinocytes, early stages of cornified envelope assembly result in the formation of an
immature structure composed of involucrin, cystatin-α and other proteins, which represent
the innermost third of the mature cornified envelope. Less than 2% of the keratinocytes
from the adherent BC-1-Ep cells or the NIKS cell line produce cornified envelopes. This
finding is consistent with previous studies demonstrating that actively growing,
subconfluent keratinocytes produce less than 5% cornified envelopes. To determine
whether the NIKS cell line is capable of producing cornified envelopes when induced to
differentiate, the cells were removed from surface culture and suspended for 24 hours in
medium made semi-solid with methylcellulose. Many aspects of terminal differentiation,
including differential expression of keratins and cornified envelope formation, can be
triggered in vitro by loss of keratinocyte cell-cell and cell-substratum adhesion. The NIKS
keratinocytes produced as many as and usually more cornified envelopes than the parental
keratinocytes. These findings demonstrate that the NIKS keratinocytes are not defective in
their ability to initiate the formation of this cell type-specific differentiation structure.

To confirm that the NIKS keratinocytes can undergo squamous differentiation, the
cells were cultivated in organotypic culture. Keratinocyte cultures grown on plastic
substrata and submerged in medium replicate but exhibit limited differentiation.
Specifically, human keratinocytes become confluent and undergo limited stratification
producing a sheet consisting of 3 or more layers of keratinocytes. By light and electron
microscopy there are striking differences between the architecture of the multilayered sheets
formed in tissue culture and intact human skin. In contrast, organotypic culturing
techniques allow for keratinocyte growth and differentiation under in vivo-like conditions.
Specifically, the cells adhere to a physiological substratum consisting of dermal fibroblasts
embedded within a fibrillar collagen base. The organotypic culture is maintained at the
air-medium interface. In this way, cells in the upper sheets are air-exposed while the
proliferating basal cells remain closest to the gradient of nutrients provided by diffusion
through the collagen gel. Under these conditions, correct tissue architecture is formed.

Several characteristics of a normal differentiating epidermis are evident. In both the
parental cells and the NIKS cell line a single layer of cuboidal basal cells rests at the
junction of the epidermis and the dermal equivalent. The rounded morphology and high
nuclear to cytoplasmic ratio are indicative of an actively dividing population of keratinocytes.




In normal human epidermis, as the basal cells divide they give rise to daughter cells that
migrate upwards into the differentiating layers of the tissue. The daughter cells increase in
size and become flattened and squamous. Eventually these cells enucleate and form
cornified, keratinized structures. This normal differentiation process is evident in the upper
layers of both the parental cells and the NIKS cells. The appearance of flattened squamous
cells is evident in the upper layers of keratinocytes and demonstrates that stratification has
occurred in the organotypic cultures. In the uppermost part of the organotypic cultures the
enucleated squames peel off the top of the culture. To date, no histological differences in
differentiation at the light microscope level between the parental keratinocytes and the
NIKS keratinocyte cell line grown in organotypic culture have been observed.

To observe more detailed characteristics of the parental (passage 5) and NIKS
(passage 38) organotypic cultures and to confirm the histological observations, samples
were analyzed using electron microscopy. Parental cells and the immortalized human
keratinocyte cell line, NIKS, were harvested after 15 days in organotypic culture and
sectioned perpendicular to the basal layer to show the extent of stratification. Both the
parental cells and the NIKS cell line undergo extensive stratification in organotypic culture
and form structures that are characteristic of normal human epidermis. Abundant
desmosomes are formed in organotypic cultures of parental cells and the NIKS cell line.
The formation of a basal lamina and associated hemidesmosomes in the basal keratinocyte
layers of both the parental cells and the cell line was also noted.

Hemidesmosomes are specialized structures that increase adhesion of the
keratinocytes to the basal lamina and help maintain the integrity and strength of the tissue.
The presence of these structures was especially evident in areas where the parental cells or
the NIKS cells had attached directly to the porous support. These findings are consistent
with earlier ultrastructural findings using human foreskin keratinocytes cultured on a
fibroblast-containing porous support. Analysis at both the light and electron microscopic
levels demonstrates that the NIKS cell line in organotypic culture can stratify, differentiate,
and form structures such as desmosomes, basal lamina, and hemidesmosomes found in
normal human epidermis.

The present invention contemplates methods and compositions for making cells (e.g.,
NIKS cells) that express an antimicrobial polypeptide. NIKS cell transformation
procedures suitable for use herein are those known in the art and include, for example with
mammalian cell systems, dextran-mediated transfection, calcium phosphate precipitation,
polybrene-mediated transfection, protoplast fusion, electroporation, encapsulation of the

antimicrobial polypeptide polynucleotide in liposomes, and direct microinjection of the
DNA into nuclei.

IV. Promoters 

In preferred embodiments, the expression vectors of the present invention comprise
a promoter which can be operably linked to a gene of interest. Promoters useful in the
present invention include, but are not limited to, the LTR or SV40 promoter, the E. coli lac
or trp, the phage lambda PL and PR, T3 and T7 promoters, and the cytomegalovirus (CMV)
immediate early, herpes simplex virus (HSV) thymidine kinase, and mouse
metallothionein-I promoters and other promoters known to control expression of genes in prokaryotic or
eukaryotic cells or their viruses.

In other embodiments, any promoter that would allow expression of the gene of 
interest in an epidermal cell host can be used in the present invention. Examples of 
promoters useful in the present invention include, but are not limited to, K14, K5, and
involucrin promoters.

In preferred embodiments, the human involucrin promoter is used to drive 
expression of the gene of interest. In such embodiments, a gene of interest is operably 
linked to the involucrin promoter and transfected into epidermal host cells (e.g., NIKS cells) 
in an expression vector. 

In other preferred embodiments, the K14 promoter is used to drive expression of a
gene of interest. In such embodiments, a gene of interest is operably linked to the K14
promoter and transfected into epidermal host cells (e.g., NIKS cells) in an expression
vector. In some embodiments, the K14 promoter is isolated from a DNA source, cloned,
sequenced, and shuttled into a selection vector. In further embodiments, isolation of the
K14 promoter DNA occurs via PCR with K14 primer sequences. Primer sequences specific
for the K14 promoter can be obtained from GenBank. Amplification of a DNA source with
such primer sequences through standard PCR procedures results in the isolation of K14
promoter DNA.

A wide variety of genes of interest may be linked to the promoter. Such genes
include, but are not limited to, those encoding KGF-2, Defensins 1, 2 or 3, Cathelicidins,
VEGF, HIF1α, Ots A and Ots B (see, e.g., International Patent Application No.
PCT/US02/06088).

Optionally, other regulatory sequences can be used herein, such as one or more of an
enhancer sequence, an intron with functional splice donor and acceptor sites, a signal
sequence for directing secretion of the gene of interest, a polyadenylation sequence, other
transcription terminator sequences, and a sequence homologous to the host cell genome.

Further, a selectable marker can be present in the expression vector for selection of
the presence thereof in the transformed host cells. Selectable markers are genes that encode
an enzymatic activity that confers the ability to grow in medium lacking what would
otherwise be an essential nutrient (e.g., the HIS3 gene in yeast cells); in addition, a
selectable marker may confer resistance to an antibiotic or drug upon the cell in which the
selectable marker is expressed. Selectable markers may be "dominant"; a dominant
selectable marker encodes an enzymatic activity that can be detected in any eukaryotic cell
line. Examples of dominant selectable markers include the bacterial aminoglycoside 3'
phosphotransferase gene (also referred to as the neo gene) that confers resistance to the drug
G418 in mammalian cells, the bacterial hygromycin B phosphotransferase (hyg) gene that
confers resistance to the antibiotic hygromycin and the bacterial xanthine-guanine
phosphoribosyl transferase gene (also referred to as the gpt gene) that confers the ability to
grow in the presence of mycophenolic acid. Other selectable markers are not dominant in
that their use must be in conjunction with a cell line that lacks the relevant enzyme activity.
Examples of non-dominant selectable markers include the thymidine kinase (tk) gene that is
used in conjunction with tk- cell lines, the CAD gene, which is used in conjunction with
CAD-deficient cells, and the mammalian hypoxanthine-guanine phosphoribosyl transferase
(hprt) gene, which is used in conjunction with hprt- cell lines. A review of the use of
selectable markers in mammalian cell lines is provided in Sambrook, J. et al., Molecular
Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York
(1989) pp. 16.9-16.15.

V. Constructs For Introduction Of Exogenous DNA Into Target Cells

The present invention contemplates cells expressing a gene of interest (e.g., KGF-2,
VEGF) and compositions and methods for making cells expressing a gene of interest. The
consensus sequence for VEGF is provided at Figure 4. The present invention is not limited
to a particular gene of interest. In preferred embodiments, cells are induced to express a
gene of interest through transfection with an expression vector containing the gene of
interest DNA. An expression vector containing the gene of interest DNA can be produced
by operably linking the gene of interest DNA to one or more regulatory sequences (e.g.,
K14 promoter) such that the resulting vector is operable in a desired host (e.g., NIKS cells).
In preferred embodiments, the regulatory sequence is an epidermal cell promoter regulatory
sequence (e.g., K-14, K-5, K-6). The present invention is not limited to a particular
promoter.

In certain embodiments, the present invention accomplishes transgene expression in
eukaryotic cells through site-specific recombination. As such, the expression vectors are
provided containing nucleic acid comprising site-specific recombination elements (e.g.,
attB), insulator elements (e.g., HS4), promoter sequences (e.g., K14) and a gene of interest
sequence (e.g., KGF-2). In preferred embodiments, the components within an expression
vector are provided in the following 5' to 3' arrangement: recombination element (e.g.,
attB), insulator element (e.g., HS4), epidermal cell-specific promoter sequence (e.g., K14),
gene of interest sequence (e.g., KGF-2), insulator element.
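
The 5' to 3' arrangement described above can be written down as an ordered list and checked programmatically. The following Python sketch is an illustration only: the names `CASSETTE` and `is_insulated` are hypothetical, and labeling the second (3') insulator as HS4 is an assumption, since the text gives an example only for the first insulator.

```python
from typing import List, Tuple

# (element type, example label), listed in 5' to 3' order as in the text.
# The label of the final insulator is an assumption for illustration.
CASSETTE: List[Tuple[str, str]] = [
    ("recombination_site", "attB"),
    ("insulator", "HS4"),
    ("promoter", "K14"),
    ("gene_of_interest", "KGF-2"),
    ("insulator", "HS4"),
]


def is_insulated(cassette: List[Tuple[str, str]]) -> bool:
    """True if the promoter precedes the gene and both lie between insulators."""
    kinds = [kind for kind, _ in cassette]
    insulators = [i for i, kind in enumerate(kinds) if kind == "insulator"]
    if len(insulators) < 2:
        return False
    if "promoter" not in kinds or "gene_of_interest" not in kinds:
        return False
    promoter = kinds.index("promoter")
    gene = kinds.index("gene_of_interest")
    return insulators[0] < promoter < gene < insulators[-1]


if __name__ == "__main__":
    print(is_insulated(CASSETTE))       # full arrangement from the text
    print(is_insulated(CASSETTE[:-1]))  # same cassette without the 3' insulator
```

A check of this kind makes the design intent explicit: the promoter and the gene of interest sit between two insulator elements, so that after integration the transcription unit is shielded on both sides from neighboring chromosomal sequences.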

In preferred embodiments, transfection of such expression vectors into eukaryotic
cells (e.g., NIKS cells) results in the introduction of the gene of interest into chromosomes
of eukaryotic cells and avoids the excision of the transgene that often occurs using
previously known site-specific recombination systems. Figure 3 presents an expression
vector with an attB site specific recombination site, HS4 insulator elements and a K14
promoter. In further preferred embodiments, a second expression vector comprising a
promoter operably linked to a recombinase sequence (e.g., φC31 integrase) is
co-transfected with the initial expression vector.

EXAMPLES

Example 1: Isolation of HS4 Insulator Element, attB Integration Site, and φC31
Integrase

A DNA fragment containing the 250 bp "core" of the HS4 insulator element was
amplified by PCR from chicken genomic DNA using primers designed to published HS4
sequences (see, e.g., Chung, J.H., et al., Proc Natl Acad Sci USA, 1997, 94(2): 575-80).
This DNA fragment was cloned into the pCR2.1 vector and sequenced to verify its identity
and integrity. The HS4 core element was identical to previously published sequences. The
HS4 core element was excised from the pCR2.1 vector with EcoRV and multimerized by
ligating the 250 bp HS4 monomer overnight and gel-purifying and cloning 500 bp DNA
fragments corresponding to HS4 dimers. Plasmids containing HS4 dimers in directly
repeated orientation were identified by restriction analysis and DNA sequencing.
Dimerized HS4 core elements were termed 2XHS4.
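
Because EcoRV leaves blunt ends, ligating the 250 bp monomer to itself can join two copies in either relative orientation, which is why the dimers had to be screened for direct repeats. The following Python sketch enumerates the possibilities. It is an illustration only: the function name `dimer_products` is hypothetical, and treating every orientation as equally possible is an assumption, not a result reported in the text.

```python
from itertools import product

MONOMER_BP = 250  # size of the HS4 core fragment given in the text


def dimer_products():
    """List each two-monomer product as (orientations, size_bp, is_direct)."""
    results = []
    for first, second in product("+-", repeat=2):
        # A direct repeat has both monomers in the same orientation.
        is_direct = first == second
        results.append((first + second, 2 * MONOMER_BP, is_direct))
    return results


if __name__ == "__main__":
    for orientations, size_bp, is_direct in dimer_products():
        label = "direct repeat" if is_direct else "inverted repeat"
        print(orientations, size_bp, "bp", label)
```

Every dimer runs at 500 bp on a gel regardless of orientation, so size selection alone cannot distinguish direct from inverted repeats; this is consistent with the use of restriction analysis and sequencing described above. Note that "++" and "--" describe the same direct-repeat molecule read from opposite strands.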
A 285 bp DNA fragment containing the attB integration target sequence was
isolated from S. lividans genomic DNA by PCR using primers to published attB sequences
(see, e.g., Rausch, H. and M. Lehmann, Nucleic Acids Res, 1991, 19(19): 5187-9). A
minimal attB element was also assembled by annealing complementary oligonucleotides to
generate a 53 bp double stranded DNA product that contains the minimal attB element (see,
e.g., Groth, A.C., et al., Proc Natl Acad Sci USA, 2000, 97(11): 5995-6000). These DNA
fragments were cloned into the pCR2.1 vector and sequenced to verify their identity and
integrity.

A DNA fragment containing the coding region for φC31 integrase was amplified by 
PCR from the phage φC31. The primers used for amplification were designed such that the 
φC31 integrase coding region was preceded by a Kozak consensus translation initiation site 
and a nuclear localization sequence was introduced immediately downstream of the φC31 
integrase coding region. This DNA fragment was cloned into the pCR2.1 vector and 
sequenced to verify its identity and integrity. The φC31 integrase coding region was then 
cloned into an expression vector such that φC31 integrase expression is driven by the human 
K14 promoter. 
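
The primer design described above fixes the order of elements around the integrase coding region. The following is a minimal sketch of that layout, not the primers actually used. The Kozak context `GCCACC`, the SV40-type nuclear localization sequence, the stop codon, and the short `toy_cds` are all illustrative assumptions, since the specification names neither the exact Kozak sequence nor the nuclear localization sequence.

```python
# Illustrative assumptions (not taken from the specification):
KOZAK = "GCCACC"                # common Kozak context placed before the ATG
NLS = "CCAAAGAAGAAGCGGAAGGTG"   # encodes PKKKRKV, an SV40-type NLS
STOP = "TGA"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def build_orf(cds: str) -> str:
    """Return Kozak + CDS (native stop removed) + NLS + stop codon."""
    cds = cds.upper()
    if not cds.startswith("ATG") or len(cds) % 3 != 0:
        raise ValueError("CDS must start with ATG and be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop the native stop so the NLS stays in frame
    return KOZAK + cds + NLS + STOP

toy_cds = "ATGGCTCAGGGTTAA"  # hypothetical placeholder, not the integrase gene
construct = build_orf(toy_cds)
print(construct)
```

Removing the native stop codon before appending the localization sequence keeps the added residues in the same reading frame as the coding region.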

Example 2: Assembly of Insulator/Targeted Integration Cassettes 

Cassettes containing the attB integration element flanked by dimerized HS4 
elements were assembled in a two-step cloning strategy. First, the 2XHS4 element was 
cloned into the NotI site of pCR2.1 vectors containing either the 53 bp minimal attB 
element or the 285 bp attB element described in Example 1. After screening for insert 
orientation by restriction analysis and DNA sequencing, a second copy of the 2XHS4 
element was cloned into the BamHI site of plasmids containing one copy of the 2XHS4 
element and either the 53 bp or 285 bp attB elements. Clones that contained copies of the 
2XHS4 element in direct orientation were identified by restriction analysis and confirmed 
by DNA sequencing. 
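
The acceptance criterion for the finished cassette (an attB element flanked by two 2XHS4 copies in direct orientation) can be written as a small check. This is a sketch under stated assumptions: the `Element` record, the strand notation, and the function `is_flanked_cassette` are hypothetical names introduced for illustration and do not describe any software used in the Examples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    name: str    # e.g. "2XHS4", "attB-53", "attB-285" (labels chosen here)
    strand: str  # "+" or "-", relative to the vector backbone

def is_flanked_cassette(elements: list) -> bool:
    """True for 2XHS4 / attB / 2XHS4 with both insulators on the same strand."""
    if len(elements) != 3:
        return False
    left, middle, right = elements
    return (
        left.name == "2XHS4"
        and right.name == "2XHS4"
        and middle.name.startswith("attB")
        and left.strand == right.strand  # direct orientation of the insulators
    )

good = [Element("2XHS4", "+"), Element("attB-285", "+"), Element("2XHS4", "+")]
bad = [Element("2XHS4", "+"), Element("attB-53", "+"), Element("2XHS4", "-")]
print(is_flanked_cassette(good), is_flanked_cassette(bad))
```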

The insulator/attB cassettes were cloned into expression vectors containing the 
coding region for VEGF165. 

Example 3: Isolation of Stably-Transfected Cells 

Stable cell clones with randomly-integrated expression constructs were isolated by 
transfecting NIKS keratinocytes with a K14-VEGF expression vector lacking the 
insulator/attB cassette. Stable cell clones that integrated via the attB element were isolated 
by co-transfecting NIKS keratinocytes with a K14-VEGF expression vector containing the 
insulator/attB cassette and an expression vector containing the φC31 integrase under control 
of the K14 promoter. In both cases, transfected cells were grown in the presence of 
blasticidin for three weeks to select for clones that had stably incorporated the K14-VEGF 
expression constructs. Independent clones were isolated, expanded, and stored in liquid 
nitrogen as glycerol stocks. 

Example 4: VEGF Expression in Stably-Transfected Clones 

Independent clones containing or lacking the insulator/targeted integration cassette 
were grown in monolayer culture and also in organotypic culture to produce skin tissue. 
Conditioned medium from cells or tissue was collected and analyzed by ELISA to quantify 
VEGF content. On average, the level of VEGF secreted from cells or tissue prepared from 
clones with the insulator/targeted integration cassette was 6-fold higher than that from 
clones lacking insulators and isolated by random integration. The data are presented in 
Table I. 

Table I: Increased Transgene Expression from Constructs Containing Insulators and 
Targeted-Integration Sequences 

Clone        Construct            VEGF expression            Average 
                                  (fold over endogenous) 
43:13 4D1    K14-VEGF             1.6X 
33:65 A1B    K14-VEGF             1.5X 
33:12 C9B    K14-VEGF             1.3X                       1.5X 
45:39B       K14-VEGF/attB-HS4    10X 
45:53 C4B    K14-VEGF/attB-HS4    4X 
55:54 A1     K14-VEGF/attB-HS4    6.5X                       6.8X 
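
The Average column of Table I can be reproduced from the per-clone values. The short calculation below is offered only as an arithmetic check; the dictionary keys are labels chosen here for the two construct groups, and the numbers are the fold-over-endogenous values listed in the table.

```python
from statistics import mean

# Per-clone VEGF expression (fold over endogenous), as listed in Table I.
fold_over_endogenous = {
    "K14-VEGF": [1.6, 1.5, 1.3],
    "K14-VEGF/attB-HS4": [10.0, 4.0, 6.5],
}

# Group means rounded to one decimal place, matching the table's precision.
averages = {
    construct: round(mean(values), 1)
    for construct, values in fold_over_endogenous.items()
}
print(averages)
```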



All publications and patents mentioned in the above specification are herein 
incorporated by reference. Although the invention has been described in connection with 
specific preferred embodiments, it should be understood that the invention as claimed 
should not be unduly limited to such specific embodiments. Indeed, various modifications 
of the described modes for carrying out the invention that are obvious to those skilled in the 
relevant fields are intended to be within the scope of the following claims. 
CLAIMS 

What is claimed is: 



1. An expression vector comprising one or more promoters operably linked to one or 
more genes of interest and a site-specific recombination element flanked by at least 
one insulator element on both the 5' and 3' sides of said recombination element, 
wherein upon recombination with a chromosome said expression vector produces an 
insert with the following 5' to 3' arrangement: site-specific recombination remnant, 
insulator element, genes of interest operably linked to promoters, insulator element, 
and site-specific recombination remnant. 



2. The expression vector of Claim 1, wherein said promoter is an epidermal cell 
specific promoter. 

3. The expression vector of Claim 1, wherein said epidermal-specific promoter is 
selected from the group consisting of the K14 promoter and the involucrin promoter. 


4. The expression vector of Claim 1, wherein said gene of interest is selected from the 
group consisting of VEGF and KGF-2. 

5. The expression vector of Claim 1, wherein said site specific recombination element 
is selected from the group consisting of attB, attP, attL, and attR. 

6. The expression vector of Claim 1, wherein said insulator element is HS-4. 

7. The expression vector of Claim 6, wherein said HS-4 is a HS-4 dimer. 


8. A system for introducing a gene into the genome of a host cell comprising: 1) the 
expression vector of Claim 1; and 2) an expression vector comprising a promoter 
operably linked to a recombinase. 

9. The system of Claim 8, wherein said recombinase is selected from the group 
consisting of a bacteriophage φC31 integrase, a coliphage P4 recombinase, a Listeria 
phage recombinase, a bacteriophage R4 Sre recombinase, a CisA recombinase, an 
XisF recombinase, and a transposon Tn4451 TnpX recombinase. 

10. The system of Claim 8, wherein said recombinase is a φC31 integrase. 

11. A kit for generating a recombinant expression vector for integrating a gene of 
interest DNA sequence, comprising: 

a) the expression vector of Claim 1; 

b) a second expression vector comprising a promoter operably linked to a 
recombinase; and 

c) instructions for integrating a gene of interest in cells. 

12. The kit of Claim 11, wherein said promoter sequence is selected from the group 
consisting of K14 and involucrin. 

13. The kit of Claim 11, wherein said at least one site-specific recombination site is 
selected from the group consisting of attB, attP, attL, and attR. 

14. The kit of Claim 11, wherein said at least one insulator group is HS4. 

15. The expression vector of Claim 14, wherein said HS-4 is a HS-4 dimer. 

16. The kit of Claim 11, wherein said recombinase is selected from the group consisting 
of a bacteriophage φC31 integrase, a coliphage P4 recombinase, a Listeria phage 
recombinase, a bacteriophage R4 Sre recombinase, a CisA recombinase, an XisF 
recombinase, and a transposon Tn4451 TnpX recombinase. 

17. The kit of Claim 11, wherein said recombinase is a φC31 integrase. 
18. A method of expressing a gene of interest in a host cell, comprising: 

a) providing: 

i) a first expression vector comprising a promoter operably linked to 
said gene of interest and a site-specific recombination element 
flanked by at least one insulator element on both the 5' and 3' sides 
of said recombination element; 

ii) a second expression vector comprising a promoter operably linked to 
a recombinase; and 

iii) host cells; 

b) introducing said first and second expression vectors into said host cells under 
conditions such that recombinase protein produced from said second 
expression vector causes integration of said first expression vector via said 
recombination element. 

19. The method of Claim 18, wherein said promoter is selected from the group 
consisting of the K14 promoter and the involucrin promoter. 

20. The method of Claim 18, wherein said gene of interest is selected from the group 
consisting of VEGF and KGF-2. 

21. The method of Claim 18, wherein said site specific recombination element is 
selected from the group consisting of attB, attP, attL, and attR. 

22. The method of Claim 18, wherein said insulator element is HS-4. 

23. The method of Claim 22, wherein said HS-4 is a HS-4 dimer. 

24. The method of Claim 18, wherein said recombinase is selected from the group 
consisting of a bacteriophage φC31 integrase, a coliphage P4 recombinase, a Listeria 
phage recombinase, a bacteriophage R4 Sre recombinase, a CisA recombinase, an 
XisF recombinase, and a transposon Tn4451 TnpX recombinase. 



25. The method of Claim 18, wherein said recombinase is a φC31 integrase. 

26. The method of Claim 18, wherein said host cell is an epidermal host cell. 

27. The method of Claim 26, wherein said epidermal host cell is a keratinocyte. 

TTACGCGTGCTAGCCCGGGCTCGATCGAGATCTGCGATCTAAGTAAGCTTATA 

TTCCATGCTAGGGTTCTGGTGTTGGTGCGTGGGGTTGGGGTGGGACTGCAGA 

AGTGCCTTTTAAGATTATGTGATTGACTGATCTGTCATTGGTTCCCTGCCATCT 

TTATCTTTTGGATTCCCCTCGGAGGAGGGGAGGAAGGAGTTTCTTTTGGGTTT 

TATTGAATCAAATGAAAGGGAAAGTAGAGGTGTTCCTATGGAGGGGAGGAA 

GGAGTTTCTTTTGGGTTTTATTGAATCAAATGAAAGGGAAAGTAGAGGTGTTC 

CTATGTCCCGGGCTCCGGAGCTTCTATTCCTGGGCCCTGCATAAGAAGGAGA 

CATGGTGGTGGTGGTGGTGGGTGGGGGTGGTGGGGCACAGAGGAAGCCGAT 

GCTGGGCTCTGCACCCCATTCCCGCTCCCAGATCCCTCTGGATATAGCACCCC 

CTCCAGTGAGCACAGCCTCCCCTTGCCCCACAGCCAACAGCAACATGCCTCC 

CAACAAAGCATCTGTCCCTCAGCCAAAACCCCTGTTGCCTCTCTCTGGGGAA 

ATTGTAGGGCTGGGCCAGGGTGGGGGGACCATTCTCTGCAGGGAGATTAGGA 

GTGTCTGTCAGGGGCGGGTGGAGCGGGGTGGGGCCCTGGCTTACTCACATCC 

TTGAGAGTCCTTTGCTGGCAGATTTGGGGAGCCCACAGCTCAGATGTCTGTCT 

CAGCATTGTCTTCCAAGCTCCTAGGCCACAGTAGTGGGGCGCTCCCTTCTCTG 

GCTTCTTCTTTGGTGACAGTCAAGGTGGGGTTGGGGGTGACGAAGGGTCCTG 

CTTCTCTTCTAGGAGCAGTTGATCCCAGGAAGAGCATTGGAGCCTCCAGCAG 

GGGCTGTTGGGGCCTGTCTGAGGAGATAGGATGCGTCAGGCAGCCCCAGACA 

CGATCACATTCCTCTCAACATGCCTGCCGGGGTCTGTGGAGCCGAGGGGCTG 

ATGGGAGGGTGGGGTGGGGGCCGGAAGGGTTTGCTTTGGGAGGTTGTCTGGG 

AGATTGCTGAAGTTTTGATATACACACCTCCAAAGCAGGACCAAGTGGACTC 

CTAGAAATGTCCCCTGACCCTTGGGGCTTCAGGAGTCAGGGACCCTCGTGTC 

CACCTCAGCCTTGCCCTTGCACAGCCCAGCTCCACTCCAGCCTCTACTCCTCC 

CCAGAACATCTCCTGGGCCAGTTCCACAAGGGGCTCAAACGAGGGCACCTGA 

GCTGCCCACACTAGGGATGTTCTGGGGGTCTGAGAAGATATCTGGGGCTGGA 

AGAATAAAAGGCCCCCCTAGGCCTGTTCCTGGATGCAGCTCCAGCCACTTTG 

GGGCTAAGCCTGGGCAATAACAATGCCAACGAGGCTTCTTGCCATACTCGGT 

TTACAAAACCCTTTACATACATTGTCGCATTGGATTCTCAGAGCTGACTGCAC 

TAAGCAGAATAGATGGTATGACTCCCACTTTGCAGATGAGAACACTGAGGCT 

CAGAGAAGTGCGAAGCCCTGGGTCACAGAGGCGTAAATGCAGAGCCAGGAC 

CCACCTGAAGACCCACCTGACTCCAGGATGTTTCCTGCCTCCATGAGGCCACC 

TGCCCTATGGTGTGGTGGATGTGAGATCCTCACCATAGGGAGGAGATTAGGG 

TCTGTGCTCAGGGCTGGGGAGAGGTGCCTGGATTTCTCTTTGATGGGGATGTT 

GGGGTGGGAATCACGATACACCTGATCAGCTGGGTGTATTTCAGGGATGGGG 

CAGACTTCTCAGCACAGCACGGCAGGTCAGGCCTGGGAGGGCCCCCCAGACC 

TCCTTGTCTCTAATAGAGGGTCATGGTGAGGGAGGCCTGTCTGTGCCCAAGGT 

GACCTTGCCATGCCGGTGCTTTCCAGCCGGGTATCCATCCCCTGCAGCAGCAG 

GCTTCCTCTACGTGGATGTTAAAGGCCCATTCAGTTCATGGAGAGCTAGCAG 

GTAACTAGGTTTAAGGTGCAGAGGCCCTGCTCTCTGTCACCCTGGCTAAGCCC 

AGTGCGCGGGTTCCTGAGGGCTGGGACTCCCAGGGTCCGATGGGAAAGTGTA 

GCCTGCAGGCCCACACCTCCCCCTGTGAATCACGCCTGGCGGGACAAGGAAG 

CCCAAAACACTCCAAACAATGAGTTTCCAGTAAAATATGACAGACATGATGA 

GGCGGATGAGAGGAGGGACCTGGCTGGGAGTTGGCGCTAGCCTGTGGGTGAT 
GAAAGCCAAGGGGAATGGAAAGTGCCAGACCCGCCCCCTACCCACGAGTAT 
AAAGCACTCGCATCCCTTTCCAATTTACCCGAGCACCTTCTCTTCACTCAGCC 
AACTGCTCGCTCGCTCACCTCCCTCCTCTGCACCAAGGGCGAATTCCAGCACA 
CTGGCGGCCGTTACTAGTGGATCC 
Figure 2 

Involucrin promoter sequence (SEQ ID NO: 2) 

AA GCTTCTCCAT GTGXCAXGGG AXAXGAGCXC AXCCXXAXXA 

1951 XGXXGGGTGG GGGTTGGACA GTTACCCAGA CXXGTCAXGX GGACCTGGAG 
2001 CXXAXGAGGX CATXXACAXA GGCAGXGAAA GAACCXCXCC CAXAXACGXG 
2051 AAXGCCXGXC TCCCAAATGG GGCAACCTGT GGGCAGAATA AGGGACTTCT 
2101 CAGCCCTAGA AXGXXGAGGX XTCCCCAACC CCTCCCTTGC ATACACACAC 
2151 ACACAAACAC TCCCTCAGCT GXATCCACXG XCCXCXXXCC CACACCCXAG 

22 01 CXXXGCCCAG CAGTCAAAGG CXCACACAXA CCAXCXXCXC CTTAAGGCTC 
2251 XXAXXAXGCC GXGAGXCAGA GGGCGGGAGG CAGATGTGGC AGATACTGAG 

23 01 CCCCXGCXAA CCCATAAGAC CGGTGTGACT XCCXXGAXCX GAGXCTGCXG 
23 51 CCCCAGACTG ACXGXCACGG GCXGGGAAGA GGCAGATTCC CCCCAGAXGA 
2401 AGTCAGCAGC AGAGCACAAG GGCATCAGCG CCAAAGTAAG GAXGCTXGAX 
2451 TAGTTCTTCA GGGCAGAGXG GGCXGXGCXX CCTCTGCCCC AGAAAATGGC 
2 501 ACAGXCCCXG XXCXAXGGGA AAAAGAAXGX GAGGTCCCTG GGXGGGCXCA 
2 551 GGGAACAGAG AGGXCATGAG GAGGGGATAG CACTGCAGAA ACGAAGGGTG 
2 601 CCXTGTGAGT CCTCCCTCTG TCTTTTTAGG CAXGAXCCAG GAACATGACA 
2 6 51 AAATTAGTGC XXXAAAXAGA TXXACXTGGG GCTAAGAGAA AXGTGCCXGT 
2 7 01 CAGGAAAACT AXGGGGAAXC AGGAGACTTC TCAAAATTAG CCTCACTGAG 
27 51 TAXXGXCXXX AXAATXCCXX CXTXXXGGAX XAGAXXGXAA AAAAGAGAGX 
2 8 01 GTAAATGAAT GATGTCCATA TAATAAGTTA TTAGCCAACC AXXAAGAAGA 
2 8 51 AAGGGAAGAA AXAAAXCAGX XXGGXXXXXA CACACACAXA CAGACACACA 
2 9 01. CATAXAAACA XXGAXCAACA .CXGAAGTGTT , ..XAAXAGXGAT .X&TTmWCGGG 

2 951 ^XCGXAAAAXT CACXGXXCXX CAATGAATAC XTGTAGAGCA CAXAXXAXAX 

3 0 01 GCAGTAGTTT XGAXAGGTXC XAGGGGTAXA GTGGAAAACA , ,XA CCAGGTAX 
3 0 51 ^ACGCXGCXCX XAGCTXAXTX TTCCAGXGGGA AAGAXAGACA AXAAGCAAGX 
3101 GAACAAAXGC AAAXAAAXTA CXCXAGAXTG XTAXAAGTGA AAXXAAGXAC 
3151 CAAXCCXXTA GAXAXGGTAC ACAGAGAAGG ATCXCXGACA GACCCCAACA 
3201 XXGACACXGA AGCXGAAAGG CAXAAAAGAA CCAGAGACCX GGGGAGGGGT 
3251 CGGXGGGCAG AAGGAGAGCA GGXGCCAAGC CCCCAGGXGG AGAGCXCXGG 
33 01 GCXCAXCXCA GGAACCGAAG GCCCXCAGXG AGGXAAGAAX AXACCXCXCA 
3 351 GGGAGAGATX GACATGAATX GGGGCCCCAG AAGAAGGCAG AAGCCAGGXA 
3401 CCCAGGGXCX XXXAAACCAC GGCAGXGAGX XTGAAXGXXA XXTCAAGXGC 
3451 GCXGGXGGAC XGXXGGCACG GGGGRGAGAX GXGCXCAAAX CCCCACXCXG 
3 501 AAAGAXTXCX XAAGCXAXTT CXAGAGXAXG ATXXACAAGA GGAAAXGGAT 
3 551 GAXTXGAXXC XGAXCXXXAX ACCXTCAXGC ATXXAAAAAA GTACXXAAGA 
3 6 01 AAGXAGXXXG GXXXGXCAXX AXAAAAAGCA ATACXXAXXX XTAXAXXGXG 
3651 XAGATXCAAX CXXGXXXCCX XGCCXAGAGX GGGCCGXGCX XXGGAGXXCX 
3701 TAXGAGCATG GCAXXCCXGA GAACTXCXCX AACXGCAGXC XCGGGGAXAG 
3751 AGGCXGGGCA GCAAGXGGCA GCAGCAGAGG ACXCCTAGAA GCCXXCXACX 
3 801 XGAGXCTACX XGGCCXAAAG XCAAACXCCC XCCACCAAAG ACAGAGXTXA 
3 851 XXXCCACATA GGAXGGAGTX AAAAAAXAXA XXCXGAGAGA GGAAGGGCXX 
3 901 GXGGCCCAAG AGAACACCCC AGAAAXACCA CCCCXXCATG GGAAGXGACX 
3 951 CXAXCXXCAA ACAXAXAACC CAGCCXGGAC ATCCCGGAAA GACACAXAAC 
4001 XXXCCAXTTC AXGCCCTTGA AAGXGAAXCX XTCGGCCXAA XAAXGAGAAC 
40 51 AAACXCAXXT XGAAAGXGGA AAAAXXGAGA XXCAGAGCAG AAGTTXGACX 
4101 AAGGTCACAA AACAGXAGGA XGCCXCACTC AGCXCCCXGX GCCXAGGTCA 
4151 GAAAAGCAXC AC AG G AAX AG XTGAGCXACC AGAAXCCXCX GGCCAGGCAG 
42 01 GAGCXGXGTG XCCCXGGGAA AXGGGGCCCX AAAGGGXXXG CXGCXXAAGA 

42 51 XGCCXGTGGX GAGXCAGGAA GQGG'VTAGAG GAAGXXGACC AACXAGAGXG 

43 01 GXGAAACCXG XCCAXCACCX XCAACCXGGA GGGAGGCCAG GCXGCAGAAX 

43 51 GAXAXAAAGA GXGCCCXGAC XCCXGCTCAG CXCAGCACXC CACCAAAGCC 

44 01 XCXGCCTCAG CCXXACXGTG AGXCXGGXAA GXGXCGGATG GXAGAACCAG 
44 51 GGXXGGGACT CGGGACCTCC AACAGCAXAC GAXGXGGXGG GGGTGGGCAG 
4501 CCXGGGXGGG GGXGGGCATX ACXCXGGGGC XGGAXXCAGC XGGACXXTCA 
4551 TTCTAGGGGG ACTCGAGTCA GAGTACTGAG AGAAAAGTGC CTTGGCACAG 
46 01 AAGTGCAGAA CAGAGAGTAA TCATCCTATG TCCCATCTTT TCTTGTGACC 

46 51 ATATTTTTGG ATTTGTGTGT GAGAGAGAAT TATGGAAGGG AGGAGGGGAA 

47 01 TAGCATTCAA TTTCTTTCCT AAACCTCTTG GGTTTTGACA GACCATCATT 
47 51 TTGCCTTCTT TATGGAGGGA GAGGTTCAGG GAAGAGCTTC CACCTTCTGG 
4801 CTATGCTGCA CAGAGGGATG GCAGAATGGG GAAACCTTTC TATTTGGAGA 
4851 AACCTAGGCA GAGCTGGGAC AGGAAAACTC AACTTAGAAG TATAAGACTT 
49 01 GGAAGAACAA CCTCCAACTC TCAGCAACCT TCCAGCTCCC GCAGCCCCAC 

49 51 CCCAGACACA AGGACTGCAG CTAAACCTCA GAAGGTCAGG AGAGAAAGCA 

50 01 GCCCTGGGGT TGAATAGGCC AACCTGCTGG CTTTACAGGG GGGAAAACCG 
5051 AATCCCAGGA GACTAAGTGA CATGCCCAGA AACACACAGC ATTCCAATGG 
5101 GAGATTGAGG CCTAGAGCAT GTCCTGTGGC TCCAGTCTGG AGGTCACACC 
5151 ATGACCTCTT AGATCCTCTC TGGCACGGCC TATAGGTTTT CTAGGACTTG 
5201 GTGTTCTCCA AGAGACATTT CATTCCCTAA GGCCTTACTC CTCACTGTGA 

52 51 CATAATCCCA GAACGCATCT CTGCTCCTTG GTCAGTGAAG CGATGAGGGT 

53 01 GGACACAAGG ACTAGACAAG AGCAGACAGT GAGCTGGCAC CTGACCCACC 
53 51 CTTGCAGAAC AGCCCTGCAG ACAGATCTCC TTGTTGGCTC TCACCTGGGA 
5401 ACAAGGAGGC TCCTAGGAGG ACCTTTCTCT GCCCCTCCAC ATTTCCACCC 
5451 TTCTCTCTCT GCTGCTTTTG GGAAATGGTA GTCCAGAGGT GGTAGGACAG 
5501 TACCCTGCCC AAGGGAAGAG GGGATGCTAA AAAACCAGAT ACTTCTGCAG 
5 5 51 ATTCCCAAGG TTTCATCTAT TTCCTTTGCC TTCAGCCTGT GCATCAGACC 
5 6 01 TCTTCTGTCT TTCAGGTTGA CAGTAGCTTC TAAGCCCGGG 
Figure 4 

VEGF sequence (SEQ ID NO: 3) 

CCATGAACTTTCTGCTGTCTTGGGTGCATTGGAGCCTTGCCTTGCTGCTCTACC 

TCCACCATGCCAAGTGGTCCCAGGCTGCACCCATGGCAGAAGGAGGAGGGC 

AGAATCATCACGAAGTGGTGAAGTTCATGGATGTCTATCAGCGCAGCTACTG 

CCATCCAATCGAGACCCTGGTGGACATCTTCCAGGAGTACCCTGATGAGATC 

GAGTACATCTTCAAGCCATCCTGTGTGCCCCTGATGCGATGCGGGGGCTGCTG 

CAATGACGAGGGCCTGGAGTGTGTGCCCACTGAGGAGTCCAACATCACCATG 

CAGATTATGCGGATCAAACCTCACCAAGGCCAGCACATAGGAGAGATGAGCT 

TCCTACAGCACAACAAATGTGAATGCAGACCAAAGAAAGATAGAGCAAGAC 

AAGAAAATCCCTGTGGGCCTTGCTCAGAGCGGAGAAAGCATTTGTTTGTACA 

AGATCCGCAGACGTGTAAATGTTCCTGCAAAAACACAGACTCGCGTTGCAAG 

GCGAGGCAGCTTGAGTTAAACGAACGTACTTGCAGATGTGACAAGCCGAGGC 

GGTGAGCCGGGCAGGAGGAAGGAGCCTCCCTCAGGGTTTCGGGAACCAGAT 

CTCTCACCAGGAAAGACTGATACAGAACGATCGA 
Figure 5 

Full-length HS4 insulator (SEQ ID NO: 4) 

GGGGAGCTCACGGGGACAGCCCCCCCCCAAAGCCCCCAGGGATGTAATTACG 

TCCCTCCCCCGCTAGGGGGCAGCAGCGAGCCGCCCGGGGCTCCGCTCCGGTC 

CGGCGCTCCCCCCGCATCCCCGAGCCGGCAGCGTGCGGGGACAGCCCGGGCA 

CGGGGAAGGTGGCACGGGATCGCTTTCCTCTGAACGCTTCTCGCTGCTCTTTG 

AGCCTGCAGACACCTGGGGGATACGGGGAAAAAGCTTTAGGCTGAAAGAGA 

GATTTAGAATGACAGAATCATAGAACNGCCTGGGTTGCAAAGGAGCACAGTG 

CTCATCCAGATCCAACCCCCTGCTATGTGCAGGNNTCATCAACCAGCAGCCC 

AGCGCGTCAGAGCCACATCCAGCCTGGCCTTGAATGCCTGCCTGCAGGGATG 

GGGCATCCACAGCCTCCTTGGGCAACCTGTTCAGTGCGTCACCACCCTCTGGG 

GAAAAACTGCCTCCTCATATCCAACCCAAACCTCCCCTGTCTCAGTGTAAAGC 

CATTCCCCCTTGTCCTATCAAGGGGGAGTTTGCTGTGACATTGTTGGTCTGGG 

GTGACACATGTTTGCCAATTCAGTGCTCACGGAGAGGCAGATCTTGGGATAA 

GGAAGTGCAGGACAGCATGGACGTGGACATGCAGGTGTTGAGGCTCTGGAC 

ACTCCAAGTCACAGCGTTCAGAACAGCCTTAAGGTCAAGAAGATAGGATAGA 

AGGACAAAGAGCAAGTTAAAACCCAGCATGGAGAGGAGCACAAAAAGGCCA 

CAGACACTGCTGGTCCCTGTGTCTGAGCCTGCATGT.TTGATGGTGTCTGGATG 

CAAGCAGAKGGGGTGGAAGAGCTTGCCTGGAGAGATACAGGCTGGGTCGTA 

GGACTGGGACAGGCAGCTGGAGAATTGCCATGTAGATGTICAIACAATCGTC 

AAATCATGAAGGCTGGAAAAGNNCTCCAAGATCCCCAAGACCAACCCCAAC 

CCACCCACCGTGCCACTGGCCATGTCCCTCAGTGCCACATCCCCACAGTTCTT 

CATCACCTCCAGGGACGGTGACNCNCNCCTCCTCCGTGGCAGCTGTGCCACT 

GCAGCACCGCTCTTTGGAGAAGGTAAATCTTGCTAAATCCAGCCCGACCCTC 

CCCTGGCACAACGTAAGGCCATTATCTCTCATCCAACTCCAGGACGGAGTCA 

GTGAGAATATT 

HS4 core (SEQ ID NO: 5) 

ATCGGGGAGCTCACGGGGACAGCCCCCCCCCAAAGCCCCCAGGGATGTAATT 

ACGTCCCTCCCCCGCTAGGGGGCAGCAGCGAGCCGCCCGGGGCTCCGCTCCG 

GTCCGGCGCTCCCCCCGCATCCCCGAGCCGGCAGCGTGCGGGGACAGCCCGG 

GCACGGGGAAGGTGGCACGGGATCGCTTTCCTCTGAACGCTTCTCGCTGCTCT 

TTGAGCCTGCAGACACCTGGGGGATACGGGGAAAAAGAT 

Dimer (SEQ ID NO: 6) 

ATCGGGGAGCTCACGGGGACAGCCCCCCCCCAAAGCCCCCAGGGATGTAATT 

ACGTCCCTCCCCCGCTAGGGGGCAGCAGCGAGCCGCCCGGGGCTCCGCTCCG 

GTCCGGCGCTCCCCCCGCATCCCCGAGCCGGCAGCGTGCGGGGACAGCCCGG 

GCACGGGGAAGGTGGCACGGGATCGCTTTCCTCTGAACGCTTCTCGCTGCTCT 

TTGAGCCTGCAGACACCTGGGGGATACGGGGAAAAAGATATCGGGGAGCTC 

ACGGGGACAGCCCCCCCCCAAAGCCCCCAGGGATGTAATTACGTCCCTCCCC 

CGCTAGGGGGCAGCAGCGAGCCGCCCGGGGCTCCGCTCCGGTCCGGCGCTCC 
CCCCGCATCCCCGAGCCGGCAGCGTGCGGGGACAGCCCGGGCACGGGGAAG 
GTGGCACGGGATCGCTTTCCTCTGAACGCTTCTCGCTGCTCTTTGAGCCTGCA 
GACACCTGGGGGATACGGGGAAAAAGAT 
Figure 6 

attB recombination site (SEQ ID NO: 7) 

GTCGACGATGTAGGTCACGGTCTCGAAGCCGCGGTGCGGGTGCCAGGGCGTG 

CCCTTGGGCTCCCCGGGCGCGTACTCCACCTCACCCATCTGGTCCATCATGAT 

GAACGGGTCGAGGTGGCGGTAGTTGATCCCGGCGAACGCGCGGCGCACCGG 

GAAGCCCTCGCCCTCGAAACCGCTGGGCGCGGTGGTCACGGTGAGCACGGGA 

CGTGCGACGGCGTCGGCGGGTGCGGATACGCGGGGCAGCGTCAGCGGGTTCT 

CGACGGTCACGGCGGGCATGTCGAC 
Figure 7 

Complete vector sequence (SEQ ID NO: 8) 

1 GACCCCGTAG AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG 
51 CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA GCGGTGGTTT 
101 GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 
151 AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG 
201 CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
251 TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG 
301 TTGGACTCAA GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC 
351 GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC TACACCGAAC 
401 TGAGATACCT ACAGCGTGAG CATTGAGAAA GCGCCACGCT TCCCGAAGGG 
451 AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG 
501 CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG 
551 GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG 
601 GGGCGGAGCC TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT 
651 GGCCTTTTGC TGGCCTTTTG CTCACATGCT GGGCCCAGCC GGCCAGATCT 
701 GAGCTCTTAC GCGTGCTAGC CCGGGCTCGA TCGAGATCTG CGATCTAAGT 
7 51 AAGCTTATAT TCCATGCTAG GGTTCTGGTG TTGGTGCGTG GGGTTGGGGT 
801 GGGACTGCAG AAGTGCCTTT TAAGATTATG TGATTGACTG ATCTGTCATT 
851 GGTTCCCTGC CATCTTTATC TTTTGGATTC CCCTCGGAGG AGGGGAGGAA 
9 01 GGAGTTTCTT TTGGGTTTTA TTGAATCAAA TGAAAGGGAA AGTAGAGGTG 
9 51 TTCCTATGGA GGGGAGGAAG GAGTTTCTTT TGGGTTTTAT TGAATCAAAT 

1001 GAAAGGGAAA GTAGAGGTGT TCCTATGTCC CGGGCTCCGG AGCTTCTATT 
1051 CCTGGGCCCT GCATAAGAAG GAGACATGGT GGTGGTGGTG GTGGGTGGGG 
1101 GTGGTGGGGC ACAGAGGAAG CCGATGCTGG GCTCTGCACC CCATTCCCGC 
1151 TCCCAGATCC CTCTGGATAT AGCACCCCCT CCAGTGAGCA CAGCCTCCCC 

12 01 TTGCCCCACA GCCAACAGCA ACATGCCTCC CAACAAAGCA TCTGTCCCTC 

12 51 AGGCAAAACC CCTGTTGCCT CTCTCTGGGG AAATTGTAGG GCTGGGCCAG 

13 01 GGTGGGGGGA CCATTCTCTG CAGGGAGATT AGGAGTGTCT GTCAGGGGCG 

13 51 GGTGGAGCGG GGTGGGGCCC TGGCTTACTC ACATCCTTGA GAGTCCTTTG 

14 01 CTGGCAGATT TGGGGAGCCC ACAGCTCAGA TGTCTGTCTC AGCATTGTCT 
1451 TCCAAGCTCC TAGGCCACAG TAGTGGGGCG CTCCCTTCTC TGGCTTCTTC 

15 01 TTTGGTGACA GTCAAGGTGG GGTTGGGGGT GACGAAGGGT CCTGCTTCTC 
1551 TTCTAGGAGC AGTTGATCCC AGGAAGAGCA TTGGAGCCTC CAGCAGGGGC 
1601 TGTTGGGGCC TGTCTGAGGA GATAGGATGC GTCAGGCAGC CCCAGACACG 
1651 ATCACATTCC TCTCAACATG CCTGCCGGGG TCTGTGGAGC CGAGGGGCTG 
17 01 ATGGGAGGGT GGGGTGGGGG CCGGAAGGGT TTGCTTTGGG AGGTTGTCTG 
1751 GGAGATTGCT GAAGTTTTGA TATACACACC TCCAAAGCAG GACCAAGTGG 
1801 ACTCCTAGAA ATGTCCCCTG ACCCTTGGGG CTTCAGGAGT CAGGGACCCT 
1851 CGTGTCCACC TCAGCCTTGC CCTTGCACAG CCCAGCTCCA CTCCAGCCTC 
1901 TACTCCTCCC CAGAACATCT CCTGGGCCAG TTCCACAAGG GGCTCAAACG 
1951 AGGGCACCTG AGCTGCCCAC ACTAGGGATG TTCTGGGGGT CTGAGAAGAT 
2001 ATCTGGGGCT GGAAGAATAA AAGGCGCCCC TAGGCCTGTT CCTGGATGCA 
2051 GCTCCAGCCA CTTTGGGGCT AAGCCTGGGC AATAACAATG CCAACGAGGC 
2101 TTCTTGCCAT ACTCGGTTTA CAAAACCCTT TACATACATT GTCGCATTGG 
2151 ATTCTCAGAG CTGACTGCAC TAAGCAGAAT AGATGGTATG ACTCCCACTT 
22 01 TGCAGATGAG AACACTGAGG CTCAGAGAAG TGCGAAGCCC TGGGTCACAG 

22 51 AGGCGTAAAT GCAGAGCCAG GACCCACCTG AAGACCCACC TGACTCCAGG 

23 01 ATGTTTCCTG CCTCCATGAG GCCACCTGCC CTATGGTGTG GTGGATGTGA 

23 51 GATCCTCACC ATAGGGAGGA GATTAGGGTC TGTGCTCAGG GCTGGGGAGA 

24 01 GGTGCCTGGA TTTCTCTTTG ATGGGGATGT TGGGGTGGGA ATCACGATAC 

24 51 ACCTGATCAG CTGGGTGTAT TTCAGGGATG GGGCAGACTT CTCAGCACAG 

25 01 CACGGCAGGT CAGGCCTGGG AGGGCCCCCC AGACCTCCTT GTCTCTAATA 
2 5 51 GAGGGTCATG GTGAGGGAGG CCTGTCTGTG CCCAAGGTGA CCTTGCCATG 
2 6 01 CCGGTGCTTT CCAGCCGGGT ATCCATCCCC TGCAGCAGCA GGCTTCCTCT 
2651 ACGTGGATGT TAAAGGCCCA TTCAGTTCAT GGAGAGCTAG CAGGAAACTA 
27 01 GGTTTAAGGT GCAGAGGCCC TGCTCTCTGT CACCCTGGCT AAGCCCAGTG 
27 51 CGTGGGTTCC TGAGGGCTGG GACTCCCAGG GTCCGATGGG AAAGTGTAGC 
2 8 01 CTGCAGGCCC ACACCTCCCC CTGTGAATCA CGCCTGGCGG GACAAGAAAG 
2 8 51 CCCAAAACAC TCCAAACAAT GAGTTTCCAG TAAAATATGA CAGACATGAT 

2 901 GAGGCGGATG AGAGGAGGGA CCTGCCTGGG AGTTGGCGCT AGCCTGTGGG 
2951 TGATGAAAGC CAAGGGGAAT GGAAAGTGCC AGACCCGCCC CCTACCCATG 

3 001 AGTATAAAGC ACTCGCATCC CTTTGCAATT TACCCGAGCA CCTTCTCTTC 
30 51 ACTCAGCCTT CTGCTCGCTC GCTCACCTCC CTCCTCTGCA CCAAGGGCGA 
3101 ATTCCAGCAC ACTGGCGGCC GTTACTAGTG GATCCGAGCT CGCGGCCGCG 
3151 ATATCGCTAG CTCGAGGAGA ACTTCAGGGT GAGTTTGGGG ACCCTTGATT 
3201 GTTCTTTCTT TTTCGCTATT GTAAAATTCA TGTTATATGG AGGGGGCAAA 
3 2 51 GTTTTCAGGG TGTTGTTTAG AATGGGAAGA TGTCCCTTGT ATCACCATGG 
3 3 01 ACCCTGATGA TAATTTTGTT TCTTTCACTT TCTACTCTGT TGACAACCAT 
3 3 51 TGTCTCCTCT TATTTTCTTT TCATTTTCTG TAACTTTTTT CGTTAAACTT 
3 401 TAGCTTGCAT TTGTAACGAA TTTTTAAATT CACTTTCGTT TATTTGTCAG 
3 451 ATTGTAAGTA CTTTCTCTAA TCACTTTTTT TTCAAGGCAA TCAGGGTAAT 
3 5 01 TATATTGTAC TTCAGCACAG TTTTAGAGAA CAATTGTTAT AATTAAATGA 
3 551 TAAGGTAGAA TATTTCTGCA TATAAATTCT GGCTGGCGTG GAAATATTCT 
3 6 01 TATTGGTAGA AACAACTACA TCCTGGTAAT CATCCTGCCT TTCTCTTTAT 
3 6 51 GGTTACAATG ATATACACTG TTTGAGATGA GGATAAAATA CTCTGAGTCC 
3 7 01 AAACCGGGCC CCTCTGCTAA CCATGTTCAT GCCTTCTTCT TTTTCCTACA 
3 7 51 GCTCCTGGGC AACGTGCTGG TTGTTGTGCT GTCTCATCAT TTTGGCAAAG 
3 8 01 AATTCACTCC TCAGGTGCAG GCTGCCTATC AGAAGGTGGT GGCTGGTGTG 

3 9 01 AAATTATGGG GACATCATGA AGCCCCTTGA GCATCTGACT TCTGGGTAAT 

3 9 51 AAAGGAAATT TATTTTCATT GCAATAGTGT GX.GGGAATTT. .XTTGTjGTCTC 
4'001 1< TCA'CTCGGAA X3GACATATGG GAGGGCAAAT CATTTAAAAC ATCAGAATGA 

4 0 51 GTATTTGGTT TAGAGTTTGG CAACATATGC CATATGCTGG CTGCCATGAA 
4101 CAAAGGTGGC TATAAAGAGG TCATCAGTAT ATGAAACAGC CCCCTGCTGT 
4151 CCATTCCTTA TTCCATAGAA AAGCCTTGAC TTGAGGTTAG ATTTTTTTTA 
4201 TATTTTGTTT TGTGTTATTT TTTTCTTTAA CATCCCTAAA ATTTTCCTTA 

42 51 CATGTTTTAC TAGCCAGATT TTTCCTCCTC TCCTGACTAC TCCCAGTCAT 

43 01 AGCTGTCCCT CTTCTCTTAT GAACTCGACT GGTCGAGATC GGGAGATCTG 
43 51 GCCTCCGCGC CGGGTTTTGG CGCCCCCCGC GGGCGCCCCC TCCTCACGGC 
4401 GAGCGCTGCC ACGTCAGACG AAGGGCGCAC GAGCGTCCTG ATCCTTCCGC 
4451 CCGGACGCTC AGGACAGCGG CCCGCTGCTC ATAAGACTCG GCCTTAGAAC 

45 01 CCCAGTATCA GCAGAAGGAC ATTTTAGGAC GGGACTTGGG TGACTCTAGG 
4551 GCACTGGTTT TCTTTCCAGA GAGCGGAACA GGCGAGGAAA AGTAGTCCCT 

46 01 TCTCGGCGAT TCTGCGGAGG GATCTCCGTG GGGCGGTGAA CGCCGATGAT 

46 51 TATATAAGGA CGCGCCGGGT GTGGCACAGC TAGTTCCGTC GCAGCCGGGA 

47 01 TTTGGGTCGC GGTTCTTGTT TGTGGATCGC TGTGATCGTC ACTTGGTGAG 

47 51 TAGCGGGCTG CTGGGCTGGC CGGGGCTTTC GTGGCCGCCG GGCCGCTCGG 

48 01 TGGGACGGAA GCGTGTGGAG AGACCGCCAA GGGCTGTAGT CTGGGTCCGC 

48 51 GAGCAAGGTT GCCCTGAACT GGGGGTTGGG GGGAGCGCAG CAAAATGGCG 

49 01 GCTGTTCCCG AGTCTTGAAT GGAAGACGCT TGTGAGGCGG GCTGTGAGGT 
4951 CGTTGAAACA AGGTGGGGGG CATGGTGGGC GGCAAGAACC CAAGGTCTTG 

50 01 AGGCCTTCGC TAATGCGGGA AAGCTCTTAT TCGGGTGAGA TGGGCTGGGG 
5051 CACCATCTGG GGACCCTGAC GTGAAGTTTG TCACTGACTG GAGAACTCGG 
5101 TTTGTCGTCT GTTGCGGGGG CGGCAGTTAT GGCGGTGCCG TTGGGCAGTG 
5151 CACCCGTACC TTTGGGAGCG CGCGCCCTCG TCGTGTCGTG ACGTCACCCG 
52 01 TTCTGTTGGC TTATAATGCA GGGTGGGGCC ACCTGCCGGT AGGTGTGCGG 

52 51 TAGGCTTTTC TCCGTCGCAG GACGCAGGGT TCGGGCCTAG GGTAGGCTCT 

53 01 CCTGAATCGA CAGGCGCCGG ACCTCTGGTG AGGGGAGGGA TAAGTGAGGC 

53 51 GTCAGTTTCT TTGGTCGGTT TTATGTACCT ATCTTCTTAA GTAGCTGAAG 

54 01 CTCCGGTTTT GAACTATGCG CTCGGGGTTG GCGAGTGTGT TTTGTGAAGT 
54 51 TTTTTAGGCA CCTTTTGAAA TGTAATCATT TGGGTCAATA TGTAATTTTC 
5501 AGTGTTAGAC TAGTAAATTG TCCGCTAAAT TCTGGCCGTT TTTGGCTTTT 
5551 TTGTTAGACC GGACCGTGTT GACAATTAAT CATCGGCATA GTATATCGGC 
5601 ATAGTATAAT ACGACAAGGT GAGGAACTAA ACCATGGCCA AGCCTTTGTC 

56 51 TCAAGAAGAA TCCACCCTCA TTGAAAGAGC AACGGCTACA ATCAACAGCA 
5701 TCCCCATCTC TGAAGACTAC AGCGTCGCCA GCGCAGCTCT CTCTAGCGAC 

57 51 GGCCGCATCT TCACTGGTGT CAATGTATAT CATTTTACTG GGGGACCTTG 
5801 TGCAGAACTC GTGGTGCTGG GCACTGCTGC TGCTGCGGCA GCTGGCAACC 
5851 TGACTTGTAT CGTCGCGATC GGAAATGAGA ACAGGGGCAT CTTGAGCCCC 
5901 TGCGGACGGT GCCGACAGGT GCTTCTCGAT CTGCATCCTG GGATCAAAGC 
59 51 CATAGTGAAG GACAGTGATG GACAGCCGAC GGCAGTTGGG ATTCGTGAAT 
6001 TGCTGCCCTC TGGTTATGTG TGGGAGGGCT AAGCACTTCG TGGCCGAGGA 
6051 GCAGGACTGA CACTCGACCT CGAAACTTGT TTATTGCAGC TTATAATGGT 
6101 TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC 
6151 ACTGCATTCT AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTATCATG 

62 01 TCTGAATTCC CGGGGATCCT CTAGATGCAT GCTCGAGCGG CCAATTCGGC 
6251 TTATCGGGGA GCTCACGGGG ACAGCCCCCC CCCAAAGCCC CCAGGGATGT 

63 01 AATTACGTCC CTCCCCCGCT AGGGGGCAGC AGCGAGCCGC CCGGGGCTCC 
63 51 GCTCCGGTCC GGCGCTCCCC CCGCATCCCC GAGCCGGCAG CGTGCGGGGA 
6401 CAGCCCGGGC ACGGGGAAGG TGGCACGGGA TCGCTTTCCT CTGAACGCTT 
6451 CTCGCTGCTC TTTGAGCCTG CAGACACCTG GGGGATACGG GGAAAAGATA 
6501 TCGGGGAGCT CACGGGGACA GCCCCCCCCC AAAGCCCCCA GGGATGTAAT 
6551 TACGTCCCTC CCCCGCTAGG GGGCAGCAGC GAGCCGCCCG GGGCTCCGCT 
6 6 01 CCGGTCCGGC GCTCCCCCCG CATCCCCGAG CCGGCAGCGT GCGGGGACAG 
6 6 51 CCCGGGCACG GGGAAGGTGG CACGGGATCG CTTTCCTCTG AACGCTTCTC 

67 51 CGAATTGGCC GCCAGTGTGA TGGATATCTG CAGAATTCGC CCTTGTCGAC 
6801 GATGTAGGTC ACGGTCTCGA AGCCGCGGTG CGGGTGCCAG GGCGTGCCCT 
6851 TGGGCTCCCC GGGCGCGTAC TCCACCTCAC CCATCTGGTC CATCATGATG 
6901 AACGGGTCGA GGTGGCGGTA GTTGATCCCG GCGAACGCGC GGCGCACCGG 

6 951 GAAGCCCTCG CCCTCGAAAC CGCTGGGCGC GGTGGTCACG GTGAGCACGG 

7 0 01 GACGTGCGAC GGCGTCGGCG GGTGCGGATA CGCGGGGCAG CGTCAGCGGG 
7 0 51 TTCTCGACGG TCACGGCGGG CATGTCGACA AGGGCGAATT CCAGCACACT 
7101 GGCGGCCGTT ACTAGTGGAT CAATTCGGCT TATCGGGGAG CTCACGGGGA 
7151 CAGCCCCCCC CCAAAGCCCC CAGGGATGTA ATTACGTCCC TCCCCCGCTA 
72 01 GGGGGCAGCA GCGAGCCGCC CGGGGCTCCG CTCCGGTCCG GCGCTCCCCC 

72 51 CGCATCCCCG AGCCGGCAGC GTGCGGGGAC AGCCCGGGCA CGGGGAAGGT 
7 3 01 GGCACGGGAT CGCTTTCCTC TGAACGCTTC TCGCTGCTCT TTGAGCCTGC 

73 51 AGACACCTGG GGGATACGGG GAAAAGATAT CGGGGAGCTC ACGGGGACAG 
7401 CCCCCCCCCA AAGCCCCCAG GGATGTAATT ACGTCCCTCC CCCGCTAGGG 

74 51 GGCAGCAGCG AGCCGCCGGG GGCTCCGCTC CGGTCCGGCG CTCGCCCCGC 
7 501 ATCCCCGAGC CGGCAGCGTG CGGGGACAGC CGGGGCAGGG GGAAGGTGGC 
7551 ACGGGATCGC TTTCCTCTGA ACGCTTCTCG CTGCTCTTTG AGCCTGCAGA 
7 601 CACCTGGGGG ATACGGGGAA AAGATAAGCC GAATTGATCC GAGCTCGGTA 
7 651 CCAAGCTTCG ACCTGCAGG 